Parsing XML in Scala
XML (eXtensible Markup Language) is a form of semi-structured data organized as a tree. Semi-structured data is helpful when you serialize program data to save it in a file or ship it across a network. XML defines a standardized document format that is easy to read and interpret.
XML consists of two basic elements: text and tags. Text is a sequence of characters. A tag consists of a less-than sign, an alphanumeric label, and a greater-than sign. An end tag is the same as a start tag except that it has a slash right after the less-than sign. A start tag and its end tag must have the same label.
For example:
<school><standard>4</standard></school>
The above is valid XML because the start and end tags match each other.
<school><standard>6</standard><standard>7</school>
The above is invalid XML because the end tag of the second standard element is missing.
<school><standard>8</school></standard>
The above XML is also invalid: the child tag standard must be closed first, and only then the parent tag school.
Since tags have to be matched, XML is structured as nested elements. A start tag and its end tag form a pair of matching tags, and elements can be nested within each other. In the examples above, standard is the nested element.
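The matching rules above can be checked mechanically by handing a string to an XML parser. The following is a minimal sketch, assuming the scala-xml module is on the classpath; the helper name isWellFormed and the three sample strings are illustrative, built from the school and standard tags named in the text.

```scala
import scala.util.Try
import scala.xml.XML

// Hypothetical helper: true when the string parses as well-formed XML.
// XML.loadString throws a parse exception on malformed input, which Try captures.
def isWellFormed(text: String): Boolean =
  Try(XML.loadString(text)).isSuccess

val nested   = "<school><standard>4</standard></school>"            // tags match
val unclosed = "<school><standard>6</standard><standard>7</school>" // missing end tag
val crossed  = "<school><standard>8</school></standard>"            // closed in the wrong order

println(isWellFormed(nested))
println(isWellFormed(unclosed))
println(isWellFormed(crossed))
```

Only the first string should be accepted; the underlying parser may also print its own error message to standard error for the two malformed strings.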
There is a shorthand notation: a single tag with a slash before the closing greater-than sign stands for both the start tag and the end tag. Such a tag denotes an empty element.
For instance, <standard/> is an empty element.
Start tags can have attributes. An attribute is a name-value pair with an equals sign in the middle. The attribute value is surrounded by double quotes or single quotes.
For instance, in a start tag such as <student name="Rob">, the attribute name is name and its value is Rob.
Now that we have a brief overview of XML, let’s look at the different things we can do in Scala for XML processing.
Scala lets you write XML literals directly in code: type a start tag and then continue writing the XML content. The compiler reads the XML content until it sees the matching end tag.
For example, open the Scala REPL shell and execute the following code:
scala> <a> Scala is a functional Programming language </a>
A Scala expression can be evaluated inside an element by wrapping it in curly braces. For example:
scala> <a> {"hi"+",Reena"} </a>
Output: res1: scala.xml.Elem = <a> hi,Reena </a>
A brace escape can include arbitrary Scala content, including XML literals. For example:
scala> val marks = 78
scala> <a> { if (marks < 80) <marks> {marks} </marks> else xml.NodeSeq.Empty } </a>
Output: res3: scala.xml.Elem = <a> <marks> 78 </marks> </a>
The code inside the curly braces evaluates to an XML node or a sequence of XML nodes. In the above example, if marks is less than 80, a marks element is added to the <a> element; otherwise nothing is added.
If the expression inside the braces does not evaluate to an XML node, it is evaluated to a Scala value, converted to a string, and inserted as text.
scala> <a> {9+40} </a>
Output: res4: scala.xml.Elem = <a> 49 </a>
The <, >, and & characters in the text are escaped when the node is printed.
scala> <a> {"</a>Hello Scala<a>"} </a>
Output: res5: scala.xml.Elem = <a> &lt;/a&gt;Hello Scala&lt;a&gt; </a>
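The three kinds of brace escape described above can be collected in one place. This is a small sketch rather than the original REPL session; it assumes the scala-xml module is available, and the value names sum, report, and escaped are made up for illustration.

```scala
import scala.xml.{Elem, NodeSeq}

val marks = 78

// A brace escape holding a plain Scala value: the value is inserted as text.
val sum: Elem = <a>{9 + 40}</a>

// A brace escape holding XML: the nested element is added only when marks < 80.
val report: Elem =
  <a>{ if (marks < 80) <marks>{marks}</marks> else NodeSeq.Empty }</a>

// Markup characters inside a string are escaped when the node is printed,
// so the string cannot break out of the enclosing element.
val escaped: Elem = <a>{"</a>Hello Scala<a>"}</a>

println(sum)
println(report)
println(escaped)
println(escaped.text) // the text method returns the unescaped characters
```

Writing the braces directly next to the tags, with no spaces around them, avoids the extra whitespace text nodes that appear in the REPL examples.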
The image below shows all of the above Scala XML literal processing in the Scala shell.
Serialization converts an internal data structure to XML so that the data can be stored, transmitted, or reused. XML literals and brace escapes are all you need for the conversion: define a toXML method that builds the XML with them.
For example, first we define a Student class and create an instance of it.
scala> abstract class Student {
  val name: String
  val id: Int
  val marks: Int
  override def toString = name
  def toXML =
    <student>
      <name>{name}</name>
      <id>{id}</id>
      <marks>{marks}</marks>
    </student>
}
scala> val stud = new Student {
  val name = "Rob"
  val id = 12
  val marks = 90
}
scala> stud.toXML
res7: scala.xml.Elem =
<student>
      <name>Rob</name>
      <id>12</id>
      <marks>90</marks>
    </student>
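Because a brace escape also accepts a sequence of nodes, the same toXML idea extends to a whole collection. The sketch below is a variation on the class above and not the original code: it assumes scala-xml is on the classpath, models Student as a case class, and uses the hypothetical element names student and students.

```scala
import scala.xml.Elem

// Assumption: the element names mirror the field names; the root is called <student>.
final case class Student(name: String, id: Int, marks: Int) {
  def toXML: Elem =
    <student><name>{name}</name><id>{id}</id><marks>{marks}</marks></student>
}

val students = List(Student("Rob", 12, 90), Student("Adam", 13, 78))

// The brace escape receives a List[Elem]; each element becomes a child node.
val roster: Elem = <students>{ students.map(_.toXML) }</students>

println(roster)
```

Keeping toXML on the class means every place that serializes a Student produces the same element layout.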
The image below shows the Scala serialization process in the Scala shell.
The XML classes offer many methods. Let us look at the most useful ones, which extract text, sub-elements, and attributes.
Extracting Text
The text method on an XML node retrieves the text within that node. For example:
scala> <a>Scala is a <b>programming</b> language </a>.text
Output: res8: String = "Scala is a programming language "
Here the tags are excluded from the output.
Extracting sub-elements
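Sub-elements are extracted by calling \ or \\ followed by the tag name. The \ method looks only at direct children, while \\ searches the element itself and all of its descendants. For example:
scala> <school><section>C</section></school> \\ "section"
Output: res21: scala.xml.NodeSeq = NodeSeq(<section>C</section>)
scala> <school><section>C</section></school> \\ "school"
Output: res22: scala.xml.NodeSeq = NodeSeq(<school><section>C</section></school>)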
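The difference between the two search methods is easiest to see on an element that is nested two levels deep. The following is a minimal sketch, assuming scala-xml is available; the school document and the value names direct and deep are invented for the illustration.

```scala
import scala.xml.Elem

val school: Elem =
  <school><standard><section>C</section></standard></school>

// \ looks only at direct children, so it does not find the nested <section>.
val direct = school \ "section"

// \\ searches the element itself and all of its descendants.
val deep = school \\ "section"

println(direct.isEmpty)
println(deep.text)
```

Both methods return a NodeSeq, so a search that finds nothing yields an empty sequence instead of throwing an exception.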
The image below shows the above XML parsing examples in the Scala shell.
Extracting attributes
Tag attributes are extracted using the same \ and \\ methods, with an at sign (@) before the attribute name. For example:
scala> val adam = <student name="Adam" iduct="12"/>
Output: adam: scala.xml.Elem = <student name="Adam" iduct="12"/>
scala> adam \\ "@name"
Output: res3: scala.xml.NodeSeq = NodeSeq(Adam)
scala> adam \\ "@iduct"
Output: res5: scala.xml.NodeSeq = NodeSeq(12)
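Attribute lookups return a NodeSeq as well, so the value usually has to be converted with text before it can be used. Here is a short sketch under the assumption that scala-xml is on the classpath; the element and its attribute names (name, id) are illustrative.

```scala
import scala.xml.Elem

val adam: Elem = <student name="Adam" id="12"/>

// The at sign selects an attribute instead of a child element.
val name: String = (adam \ "@name").text
val id: Int      = (adam \ "@id").text.toInt

// Looking up an attribute that is not present gives an empty NodeSeq.
val missing = adam \ "@marks"

println(name)
println(id)
println(missing.isEmpty)
```

Checking isEmpty before calling toInt is a simple way to avoid a NumberFormatException on an absent attribute.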
Deserialization converts the XML back to the internal data structure so that the program can use it. We reuse the Student class and its toXML method from the serialization example, and add a fromXML method that does the reverse conversion.
scala> def fromXML(node: scala.xml.Node): Student =
  new Student {
    val name = (node \ "name").text
    val id = (node \ "id").text.toInt
    val marks = (node \ "marks").text.toInt
  }
Output: fromXML: (node: scala.xml.Node)Student
Now take the stud instance created in the serialization example:
scala> val stud = new Student {
  val name = "Rob"
  val id = 12
  val marks = 90
}
Invoke its toXML method:
scala> val st = stud.toXML
st: scala.xml.Elem =
<student>
      <name>Rob</name>
      <id>12</id>
      <marks>90</marks>
    </student>
Then call the fromXML method:
scala> fromXML(st)
Output: res17: Student = Rob
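A quick way to gain confidence in a toXML and fromXML pair is to check that they round-trip. The sketch below is not the original code: it assumes scala-xml is available, uses a case class so that two students can be compared with ==, and adds trim because text taken from hand-written XML may carry surrounding whitespace.

```scala
import scala.xml.{Elem, Node}

// Assumption: the root element is <student> with one child per field.
final case class Student(name: String, id: Int, marks: Int) {
  def toXML: Elem =
    <student><name>{name}</name><id>{id}</id><marks>{marks}</marks></student>
}

// Reverse conversion; trim guards toInt against whitespace around the digits.
def fromXML(node: Node): Student =
  Student(
    name  = (node \ "name").text.trim,
    id    = (node \ "id").text.trim.toInt,
    marks = (node \ "marks").text.trim.toInt
  )

val original = Student("Rob", 12, 90)
val restored = fromXML(original.toXML)

println(restored == original)
```

Without the trim calls, an element written as <id> 12 </id> would make toInt throw a NumberFormatException.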
The XML.save method writes an XML node to a file. The first argument is the name of the file to save to, the second is the node, the third is the character encoding, the fourth says whether to write an XML declaration at the top that includes the character encoding, and the fifth is the document type (null when there is none).
For example:
scala> scala.xml.XML.save("stud.xml", st, "UTF-8", true, null)
Here st is the node created above in the deserialization example.
Now open the stud.xml file, which contains the following:
<?xml version='1.0' encoding='UTF-8'?>
<student>
      <name>Rob</name>
      <id>12</id>
      <marks>90</marks>
    </student>
To load the file back, use the load method:
scala> val s1 = xml.XML.load("stud.xml")
s1: scala.xml.Elem =
<student>
      <name>Rob</name>
      <id>12</id>
      <marks>90</marks>
    </student>
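Saving and loading can be exercised together without leaving files behind by writing to a temporary file. This is a sketch under a few assumptions: scala-xml is on the classpath, the student element is a stand-in for the st node above, and XML.loadFile is used in place of load so that the file is addressed through a java.io.File.

```scala
import java.nio.file.Files
import scala.xml.XML

val st = <student><name>Rob</name><id>12</id><marks>90</marks></student>

// Temporary file so the example does not touch the working directory.
val path = Files.createTempFile("stud", ".xml")

// Arguments: file name, node, encoding, write XML declaration, document type.
XML.save(path.toString, st, "UTF-8", true, null)

// The first line of the file is the XML declaration requested above.
val firstLine = Files.readAllLines(path).get(0)

// Read the file back into an element.
val loaded = XML.loadFile(path.toFile)

println(firstLine)
println(loaded)

Files.deleteIfExists(path)
```

The loaded element should print the same markup as st, which shows that nothing was lost on the way through the file.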
That’s all for XML processing in Scala programming. We will look into more Scala features in coming posts.
Reposted from: http://jdqzd.baihongyu.com/