How to read XML file in Java
In this tutorial, we show how to read and parse an XML file using the DOM parser provided by the JDK.
If you’re interested in the SAX or StAX parser, please refer to these tutorials: SAX parser, StAX parser.
1- students.xml
Suppose we have the following students.xml file:
    <students>
        <student graduated="yes">
            <id>1</id>
            <name>Hussein</name>
        </student>
        <student>
            <id>2</id>
            <name>Alex</name>
        </student>
    </students>
2- Instantiate XML file
The DOM parser loads the whole XML document into memory and represents every XML tag as an element node.
In order to instantiate a new Document object from an XML file, we do the following:
    File xmlFile = new File("students.xml");
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(xmlFile);

This is done only once; all subsequent parsing operations work on the Document object.
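The snippet above omits imports and exception handling. As a minimal runnable sketch, the loading step could be wrapped as shown here. The class name LoadXmlExample and the helper load() are hypothetical, the secure-processing feature and the normalize() call are optional additions, and the temporary file only exists so the example runs without a students.xml on disk.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class LoadXmlExample {

    // Hypothetical helper: parses the given file into a DOM Document.
    public static Document load(File xmlFile)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Optional hardening (an addition, not part of the tutorial snippet).
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(xmlFile);
        // Merge adjacent text nodes so later traversals see a tidy tree.
        doc.getDocumentElement().normalize();
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // Write a small sample to a temporary file so the sketch runs anywhere.
        Path tmp = Files.createTempFile("students", ".xml");
        String xml = "<students><student><id>1</id><name>Hussein</name></student></students>";
        Files.write(tmp, xml.getBytes(StandardCharsets.UTF_8));
        try {
            Document doc = load(tmp.toFile());
            System.out.println(doc.getDocumentElement().getNodeName()); // prints: students
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The three checked exceptions correspond to a misconfigured parser, malformed XML, and an unreadable file, so callers can decide how to report each case.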
3- Get Root Node
To get the root node or element of an XML file, use the following:
    doc.getDocumentElement()
In students.xml, the root node is: students.
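To see the root element in action, here is a small sketch that parses XML held in a String and prints the root's tag name. The class RootElementExample and the helper parse() are hypothetical names introduced for illustration, and the inline XML is a shortened stand-in for students.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class RootElementExample {

    // Hypothetical helper: parses XML held in a String, handy for small demos.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<students><student graduated=\"yes\"><id>1</id></student></students>");
        Element root = doc.getDocumentElement();
        System.out.println(root.getNodeName());               // prints: students
        System.out.println(root.getChildNodes().getLength()); // prints: 1 (direct children only)
    }
}
```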
4- Get All Nodes
To retrieve all nodes with a specific tag name, use the getElementsByTagName() method.
In the following example, we parse students.xml and print out all the defined students.
    private static void getAllStudents(Document doc) {
        NodeList studentNodes = doc.getElementsByTagName("student");
        for (int i = 0; i < studentNodes.getLength(); i++) {
            Node studentNode = studentNodes.item(i);
            if (studentNode.getNodeType() == Node.ELEMENT_NODE) {
                Element studentElement = (Element) studentNode;
                String studentId = studentElement.getElementsByTagName("id").item(0).getTextContent();
                String studentName = studentElement.getElementsByTagName("name").item(0).getTextContent();
                System.out.println("Student Id = " + studentId);
                System.out.println("Student Name = " + studentName);
            }
        }
    }

Here, Node.ELEMENT_NODE identifies a node that represents an XML element (a tag), as opposed to a text, comment, or attribute node. getElementsByTagName() only returns elements, so the check is defensive and makes the cast to Element safe.
Calling the above method would give the following output:
    Student Id = 1
    Student Name = Hussein
    Student Id = 2
    Student Name = Alex
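Printing inside the loop is fine for a demo, but callers often want the values back. The sketch below is one possible variant that collects the student names into a list and skips any student that has no name element. StudentNamesExample, parse() and studentNames() are hypothetical names, and the XML is passed as a String only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StudentNamesExample {

    // Hypothetical helper: parses XML held in a String.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the first <name> of every <student>, in document order.
    public static List<String> studentNames(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList students = doc.getElementsByTagName("student");
        for (int i = 0; i < students.getLength(); i++) {
            // getElementsByTagName() only returns elements, so the cast is safe.
            Element student = (Element) students.item(i);
            NodeList nameNodes = student.getElementsByTagName("name");
            if (nameNodes.getLength() > 0) { // skip students without a <name>
                names.add(nameNodes.item(0).getTextContent());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<students>"
                + "<student graduated=\"yes\"><id>1</id><name>Hussein</name></student>"
                + "<student><id>2</id><name>Alex</name></student>"
                + "</students>");
        System.out.println(studentNames(doc)); // prints: [Hussein, Alex]
    }
}
```

The length check avoids the NullPointerException that item(0) would otherwise cause when a student has no name element.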
To traverse the whole XML file starting from the root node, you can call the getChildNodes() method recursively as follows:

    private static void parseWholeXML(Node startingNode) {
        NodeList childNodes = startingNode.getChildNodes();
        for (int i = 0; i < childNodes.getLength(); i++) {
            Node childNode = childNodes.item(i);
            if (childNode.getNodeType() == Node.ELEMENT_NODE) {
                parseWholeXML(childNode);
            } else {
                // trim() filters out nodes that hold only whitespace (line breaks and indentation).
                if (!childNode.getTextContent().trim().isEmpty()) {
                    System.out.println(childNode.getTextContent());
                }
            }
        }
    }

In this example, we traverse the students.xml file and print out the content of its text nodes.
Running the above method would give the following output:
    1
    Hussein
    2
    Alex
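The trim() check in parseWholeXML exists because the line breaks and indentation between tags are themselves stored as text nodes. The following sketch makes that visible by counting the root's direct children by node type. WhitespaceNodesExample, parse() and countChildren() are hypothetical names, and the inline XML is a shortened, indented stand-in for students.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WhitespaceNodesExample {

    // Hypothetical helper: parses XML held in a String and merges adjacent text nodes.
    public static Document parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        doc.getDocumentElement().normalize();
        return doc;
    }

    // Counts the direct children of the root element that have the given node type.
    public static int countChildren(Document doc, short nodeType) {
        NodeList children = doc.getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<students>\n"
                + "  <student><id>1</id></student>\n"
                + "  <student><id>2</id></student>\n"
                + "</students>");
        System.out.println(countChildren(doc, Node.ELEMENT_NODE)); // prints: 2
        System.out.println(countChildren(doc, Node.TEXT_NODE));    // prints: 3 (whitespace only)
    }
}
```

The three text nodes are the whitespace before, between, and after the two student elements, which is exactly what the trim() check discards.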
5- Get Node by text value
To search for a node by its value, you can use the getElementsByTagName() method and check the text content of the matching child element.

    private static void getStudentById(Document doc, String textNodeName, String textNodeValue) {
        NodeList studentNodes = doc.getElementsByTagName("student");
        for (int i = 0; i < studentNodes.getLength(); i++) {
            Node studentNode = studentNodes.item(i);
            if (studentNode.getNodeType() == Node.ELEMENT_NODE) {
                Element studentElement = (Element) studentNode;
                NodeList textNodes = studentElement.getElementsByTagName(textNodeName);
                if (textNodes.getLength() > 0) {
                    if (textNodes.item(0).getTextContent().equalsIgnoreCase(textNodeValue)) {
                        System.out.println(textNodes.item(0).getTextContent());
                        System.out.println(studentElement.getElementsByTagName("name").item(0).getTextContent());
                    }
                }
            }
        }
    }
In this example, we are looking for the student who has a specific id.
Now, if we call the method as:
    getStudentById(doc, "id", "2");
we get the following output:
    2
    Alex
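As an alternative to the manual loop, the same lookup can be expressed with the JDK's XPath API (javax.xml.xpath). This is a different technique from the one used in this tutorial, shown only as a sketch: XPathByIdExample, parse() and nameOfStudentWithId() are hypothetical names, and unlike equalsIgnoreCase() the XPath comparison is an exact, case-sensitive match.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathByIdExample {

    // Hypothetical helper: parses XML held in a String.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the name of the student whose <id> equals the given value,
    // or an empty string when no student matches.
    public static String nameOfStudentWithId(Document doc, String id)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Pass the value as an XPath variable instead of concatenating it
        // into the expression, so quotes in the input cannot alter the query.
        xpath.setXPathVariableResolver(
                qname -> "id".equals(qname.getLocalPart()) ? id : null);
        return xpath.evaluate("/students/student[id = $id]/name", doc);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<students>"
                + "<student graduated=\"yes\"><id>1</id><name>Hussein</name></student>"
                + "<student><id>2</id><name>Alex</name></student>"
                + "</students>");
        System.out.println(nameOfStudentWithId(doc, "2")); // prints: Alex
    }
}
```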
6- Get Node by attribute value
To search for a node by the value of a specific attribute, you can combine the getElementsByTagName() and getAttribute() methods as follows:

    private static void getGraduatedStudents(Document doc, String attributeName, String attributeValue) {
        NodeList studentNodes = doc.getElementsByTagName("student");
        for (int i = 0; i < studentNodes.getLength(); i++) {
            Node studentNode = studentNodes.item(i);
            if (studentNode.getNodeType() == Node.ELEMENT_NODE) {
                Element studentElement = (Element) studentNode;
                if (attributeValue.equalsIgnoreCase(studentElement.getAttribute(attributeName))) {
                    String studentId = studentElement.getElementsByTagName("id").item(0).getTextContent();
                    String studentName = studentElement.getElementsByTagName("name").item(0).getTextContent();
                    System.out.println("Student Id = " + studentId);
                    System.out.println("Student Name = " + studentName);
                }
            }
        }
    }

In this example, we look for all graduated students, i.e. students who have the graduated="yes" attribute.
Now, if we call the method as:
    getGraduatedStudents(doc, "graduated", "yes");

we get the following output:

    Student Id = 1
    Student Name = Hussein
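One detail worth knowing when filtering by attribute: getAttribute() returns an empty string, not null, when the attribute is absent, which is why the comparison above is safe for the second student. If you need to tell "missing" apart from "present but empty", hasAttribute() can be used. The sketch below illustrates both. AttributeLookupExample, parse() and idsWithAttribute() are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AttributeLookupExample {

    // Hypothetical helper: parses XML held in a String.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the ids of all students that carry the given attribute, whatever its value.
    public static List<String> idsWithAttribute(Document doc, String attributeName) {
        List<String> ids = new ArrayList<>();
        NodeList students = doc.getElementsByTagName("student");
        for (int i = 0; i < students.getLength(); i++) {
            Element student = (Element) students.item(i);
            NodeList idNodes = student.getElementsByTagName("id");
            if (student.hasAttribute(attributeName) && idNodes.getLength() > 0) {
                ids.add(idNodes.item(0).getTextContent());
            }
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<students>"
                + "<student graduated=\"yes\"><id>1</id><name>Hussein</name></student>"
                + "<student><id>2</id><name>Alex</name></student>"
                + "</students>");
        Element second = (Element) doc.getElementsByTagName("student").item(1);
        System.out.println(second.getAttribute("graduated").isEmpty()); // prints: true
        System.out.println(idsWithAttribute(doc, "graduated"));         // prints: [1]
    }
}
```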
7- Source Code
You can download the source code from this repository: Read-XML
P.S.: refer to How to create XML file in Java to see how to create a new XML file programmatically using the DOM parser.