Florida Feeney

1662456960

Fuzi: A Fast & Lightweight XML & HTML Parser in Swift

Fuzi (斧子)

A fast & lightweight XML/HTML parser in Swift that makes your life easier. [Documentation]

Fuzi is based on a Swift port of Mattt Thompson's Ono(斧). It uses most of Ono's low-level implementations, with a moderate class and interface redesign that follows standard Swift conventions, along with several bug fixes.

Fuzi(斧子) means "axe", in homage to Ono(斧), which in turn is inspired by Nokogiri (鋸), which means "saw".

A Quick Look

let xml = "..."
// or
// let xmlData = <some NSData or Data>
do {
  let document = try XMLDocument(string: xml)
  // or
  // let document = try XMLDocument(data: xmlData)
  
  if let root = document.root {
    // Accessing all child nodes of root element
    for element in root.children {
      print("\(element.tag): \(element.attributes)")
    }
    
    // Getting child element by tag & accessing attributes
    if let length = root.firstChild(tag:"Length", inNamespace: "dc") {
      print(length["unit"])     // `unit` attribute
      print(length.attributes)  // all attributes
    }
  }
  
  // XPath & CSS queries
  for element in document.xpath("//element") {
    print("\(element.tag): \(element.attributes)")
  }
  
  if let firstLink = document.firstChild(css: "a, link") {
    print(firstLink["href"])
  }
} catch let error {
  print(error)
}

Features

Inherited from Ono

  • Extremely performant document parsing and traversal, powered by libxml2
  • Support for both XPath and CSS queries
  • Automatic conversion of date and number values
  • Correct, common-sense handling of XML namespaces for elements and attributes
  • Ability to load HTML and XML documents from either String or NSData or [CChar]
  • Comprehensive test suite
  • Full documentation

Improved in Fuzi

  • Simple, modern API following standard Swift conventions, no more return types like AnyObject! that cause unnecessary type casts
  • Customizable date and number formatters
  • Some bug fixes
  • More convenience methods for HTML Documents
  • Access XML nodes of all types (Including text, comment, etc.)
  • Support for more CSS selectors (yet to come)
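
To make "automatic conversion of date and number values" more concrete, here is a minimal Foundation-only sketch of the work involved in turning an element's text into a number or a date. It does not use Fuzi's API: the helper names `parseNumber` and `parseDate`, the sample values, and the date format are all assumptions made for illustration.

```swift
import Foundation

// Hypothetical helpers, not part of Fuzi: they show the kind of conversion a
// parser performs when it exposes a node's text as a number or a date.

/// Parses text such as "42" or "3.14" with a fixed, locale-independent formatter.
func parseNumber(_ text: String) -> NSNumber? {
    let formatter = NumberFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.numberStyle = .decimal
    return formatter.number(from: text.trimmingCharacters(in: .whitespacesAndNewlines))
}

/// Parses a timestamp such as "2015-06-01T12:30:00Z".
func parseDate(_ text: String) -> Date? {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.timeZone = TimeZone(secondsFromGMT: 0)
    formatter.dateFormat = "yyyy-MM-dd'T'HH:mm:ssZZZZZ"
    return formatter.date(from: text.trimmingCharacters(in: .whitespacesAndNewlines))
}

// Sample values standing in for an element's string content.
let length = parseNumber(" 120 ")
let published = parseDate("2015-06-01T12:30:00Z")
print(length as Any, published as Any)
```

Swapping in a differently configured formatter is all it takes to accept another format, which is the idea behind customizable formatters.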

Requirements

  • iOS 8.0+ / Mac OS X 10.9+
  • Xcode 8.0+

Use version 0.4.0 for Swift 2.3.

Installation

There are 4 ways to install Fuzi into your project.

Using CocoaPods

You can use CocoaPods to install Fuzi by adding it to your Podfile:

platform :ios, '8.0'
use_frameworks!

target 'MyApp' do
    pod 'Fuzi', '~> 1.0.0'
end

Then, run the following command:

$ pod install

Using Swift Package Manager

The Swift Package Manager is built into Xcode 11 and later. You can add Fuzi as a dependency by choosing File > Swift Packages > Add Package Dependency..., or by opening the Swift Packages tab of your project file and clicking +. Use https://github.com/cezheng/Fuzi as the repository URL and Xcode should automatically resolve the current version.
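
For projects that declare dependencies in a manifest instead of through the Xcode menu, a Package.swift along these lines should work. This is a sketch under assumptions: the package and target name `MyApp` are hypothetical, and the version requirement is a placeholder that should be replaced with the release you actually want.

```swift
// swift-tools-version:5.0
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    dependencies: [
        // Repository URL taken from this README; the version requirement is an assumption.
        .package(url: "https://github.com/cezheng/Fuzi", from: "3.0.0"),
    ],
    targets: [
        // Hypothetical target that links against the Fuzi product.
        .target(name: "MyApp", dependencies: ["Fuzi"]),
    ]
)
```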

Manually

  1. Add all *.swift files in Fuzi directory into your project.
  2. In your Xcode project Build Settings:
    1. Find Search Paths, add $(SDKROOT)/usr/include/libxml2 to Header Search Paths.
    2. Find Linking, add -lxml2 to Other Linker Flags.

Using Carthage

Create a Cartfile or Cartfile.private in the root directory of your project, and add the following line:

github "cezheng/Fuzi" ~> 1.0.0

Run the following command:

$ carthage update

Then do the following in Xcode:

  1. Drag the Fuzi.framework built by Carthage into your target's General -> Embedded Binaries.
  2. In Build Settings, find Search Paths, add $(SDKROOT)/usr/include/libxml2 to Header Search Paths.

Usage

XML

import Fuzi

let xml = "..."
do {
  // if encoding is omitted, it defaults to String.Encoding.utf8
  let document = try XMLDocument(string: xml, encoding: String.Encoding.utf8)
  if let root = document.root {
    print(root.tag)
    
    // define a prefix for a namespace
    document.definePrefix("atom", defaultNamespace: "http://www.w3.org/2005/Atom")
    
    // get first child element with given tag in namespace(optional)
    print(root.firstChild(tag: "title", inNamespace: "atom"))

    // iterate through all children, keeping track of each child's position
    for (index, element) in root.children.enumerated() {
      print("\(index) \(element.tag): \(element.attributes)")
    }
  }
  // you can also use CSS selectors against an XMLDocument when it makes sense
} catch let error as XMLError {
  switch error {
  case .noError: print("wth this should not appear")
  case .parserFailure, .invalidData: print(error)
  case .libXMLError(let code, let message):
    print("libxml error code: \(code), message: \(message)")
  }
}
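
The `catch let error as XMLError` pattern above relies on a Swift enum with associated values. The following library-free sketch shows the same shape with a stand-in error type. Everything in it is hypothetical: `ParseError`, `parse`, `describeFailure`, and the error code and message are illustrative and are not Fuzi's definitions.

```swift
// A stand-in for a parser's error type. The case names echo the snippet above,
// but this enum is hypothetical and is not Fuzi's XMLError.
enum ParseError: Error {
    case parserFailure
    case invalidData
    case libraryError(code: Int, message: String)
}

// Hypothetical parse function that only checks for an opening angle bracket.
func parse(_ text: String) throws -> String {
    guard !text.isEmpty else { throw ParseError.invalidData }
    guard text.hasPrefix("<") else {
        // Illustrative code and message, not real library output.
        throw ParseError.libraryError(code: 4, message: "start tag expected")
    }
    return text
}

// Turns the outcome of parsing into a short description, one branch per case.
func describeFailure(of text: String) -> String {
    do {
        _ = try parse(text)
        return "ok"
    } catch let error as ParseError {
        switch error {
        case .parserFailure, .invalidData:
            return "bad input: \(error)"
        case .libraryError(let code, let message):
            return "library error \(code): \(message)"
        }
    } catch {
        // Anything that is not a ParseError ends up here.
        return "unexpected: \(error)"
    }
}

print(describeFailure(of: "plain text"))
```

Adding a final general `catch` keeps the `do` block exhaustive, which is required when the code lives inside a non-throwing function.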

HTML

HTMLDocument is a subclass of XMLDocument.

import Fuzi

let html = "<html>...</html>"
do {
  // if encoding is omitted, it defaults to String.Encoding.utf8
  let doc = try HTMLDocument(string: html, encoding: String.Encoding.utf8)
  
  // CSS queries
  if let elementById = doc.firstChild(css: "#id") {
    print(elementById.stringValue)
  }
  for link in doc.css("a, link") {
    print(link.rawXML)
    print(link["href"])
  }
  
  // XPath queries
  if let firstAnchor = doc.firstChild(xpath: "//body/a") {
    print(firstAnchor["href"])
  }
  for script in doc.xpath("//head/script") {
    print(script["src"])
  }
  
  // Evaluate XPath functions
  if let result = doc.eval(xpath: "count(/*/a)") {
    print("anchor count : \(result.doubleValue)")
  }
  
  // Convenient HTML methods
  print(doc.title) // gets <title>'s innerHTML in <head>
  print(doc.head)  // gets <head> element
  print(doc.body)  // gets <body> element
  
} catch let error {
  print(error)
}

I don't care about error handling

import Fuzi

let xml = "..."

// Don't show me the errors, just don't crash
if let doc1 = try? XMLDocument(string: xml) {
  //...
}

let html = "<html>...</html>"

// I'm sure this won't crash
let doc2 = try! HTMLDocument(string: html)
//...
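
The difference between `try?` and `try!` is a Swift language feature rather than anything specific to Fuzi. The sketch below demonstrates it with a hypothetical `makeDocument(from:)` function and `Document` type, both invented for illustration.

```swift
// Hypothetical error and document types, used only to demonstrate the keywords.
struct EmptyInputError: Error {}

struct Document {
    let source: String
}

// Stand-in for a throwing initializer: fails on empty input.
func makeDocument(from source: String) throws -> Document {
    guard !source.isEmpty else { throw EmptyInputError() }
    return Document(source: source)
}

// `try?` converts a thrown error into nil, so failure is just the `else` branch.
if let doc = try? makeDocument(from: "<root/>") {
    print("parsed \(doc.source.count) characters")
} else {
    print("could not parse")
}

// The error is discarded and the result is nil; nothing crashes.
let failed = try? makeDocument(from: "")

// `try!` skips error handling entirely and traps at runtime if an error is
// thrown, so it is best reserved for input you fully control.
let trusted = try! makeDocument(from: "<html></html>")
print(failed == nil, trusted.source)
```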

I want to access Text Nodes

You are not limited to text nodes: you can specify which types of nodes you would like to access.

let document = ...
// Get all child nodes that are Element nodes, Text nodes, or Comment nodes
document.root?.childNodes(ofTypes: [.Element, .Text, .Comment])

Migrating From Ono?

Looking at example programs is the swiftest way to know the difference. The following 2 examples do exactly the same thing.

Ono Example

Fuzi Example

Accessing children

Ono

[doc firstChildWithTag:tag inNamespace:namespace];
[doc firstChildWithXPath:xpath];
[doc firstChildWithCSS:css];
for (ONOXMLElement *element in parent.children) {
  //...
}
[doc childrenWithTag:tag inNamespace:namespace];

Fuzi

doc.firstChild(tag: tag, inNamespace: namespace)
doc.firstChild(xpath: xpath)
doc.firstChild(css: css)
for element in parent.children {
  //...
}
doc.children(tag: tag, inNamespace:namespace)

Iterate through query results

Ono

Conforms to NSFastEnumeration.

// simply iterating through the results
// mark `__unused` to unused params `idx` and `stop`
[doc enumerateElementsWithXPath:xpath usingBlock:^(ONOXMLElement *element, __unused NSUInteger idx, __unused BOOL *stop) {
  NSLog(@"%@", element);
}];

// stop the iteration at second element
[doc enumerateElementsWithXPath:xpath usingBlock:^(ONOXMLElement *element, NSUInteger idx, BOOL *stop) {
  *stop = (idx == 1);
}];

// getting element by index
ONOXMLElement *nthElement = [(NSEnumerator*)[doc CSS:css] allObjects][n];

// total element count
NSUInteger count = [(NSEnumerator*)[doc XPath:xpath] allObjects].count;

Fuzi

Conforms to Swift's SequenceType and Indexable.

// simply iterating through the results
// no need to write the unused `idx` or `stop` params
for element in doc.xpath(xpath) {
  print(element)
}

// stop the iteration at second element
for (index, _) in doc.xpath(xpath).enumerated() {
  if index == 1 {
    break
  }
}

// getting element by index 
if let nthElement = doc.css(css)[n] {
  //...
}

// total element count
let count = doc.xpath(xpath).count
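
The iteration patterns above are ordinary Swift collection operations. As a library-free sketch, the same steps can be tried on a plain array standing in for a query result. The tag names and the index `n` are made up for the example.

```swift
// A plain array stands in for a query result; the tag names are made up.
let results = ["title", "link", "entry", "entry"]

// Stop the iteration at the second element.
var visited: [String] = []
for (index, element) in results.enumerated() {
    visited.append(element)
    if index == 1 {
        break
    }
}

// Get an element by index, guarding against an out-of-range position.
let n = 2
let nthElement: String? = results.indices.contains(n) ? results[n] : nil

// Total element count.
let count = results.count
print(visited, nthElement as Any, count)
```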

Evaluating XPath Functions

Ono

ONOXPathFunctionResult *result = [doc functionResultByEvaluatingXPath:xpath];
result.boolValue;    //BOOL
result.numericValue; //double
result.stringValue;  //NSString

Fuzi

if let result = doc.eval(xpath: xpath) {
  result.boolValue   //Bool
  result.doubleValue //Double
  result.stringValue //String
}
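
An XPath function such as `count(...)` produces a single value that can be read as a boolean, a number, or a string. The sketch below models those three views with a hypothetical `XPathValue` enum and simplified XPath 1.0 conversion rules. It is an illustration only and not how Fuzi implements its result type.

```swift
import Foundation

// Hypothetical model of an XPath 1.0 result with simplified conversions.
enum XPathValue {
    case boolean(Bool)
    case number(Double)
    case string(String)

    // A number is true unless it is zero or NaN; a string is true unless empty.
    var boolValue: Bool {
        switch self {
        case .boolean(let b): return b
        case .number(let n): return n != 0 && !n.isNaN
        case .string(let s): return !s.isEmpty
        }
    }

    // true maps to 1, false to 0; unparsable strings become NaN.
    var doubleValue: Double {
        switch self {
        case .boolean(let b): return b ? 1 : 0
        case .number(let n): return n
        case .string(let s):
            return Double(s.trimmingCharacters(in: .whitespacesAndNewlines)) ?? .nan
        }
    }

    var stringValue: String {
        switch self {
        case .boolean(let b): return b ? "true" : "false"
        case .number(let n):
            if n.isNaN { return "NaN" }
            if n.isInfinite { return n > 0 ? "Infinity" : "-Infinity" }
            // Whole numbers are rendered without a fractional part.
            if n == n.rounded(), abs(n) < 1e15 { return String(Int64(n)) }
            return String(n)
        case .string(let s): return s
        }
    }
}

// For example, the result of counting three anchors:
let anchorCount = XPathValue.number(3)
print(anchorCount.boolValue, anchorCount.doubleValue, anchorCount.stringValue)
```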

License

Fuzi is released under the MIT license. See LICENSE for details.


Download Details:

Author: cezheng
Source code: https://github.com/cezheng/Fuzi

License: MIT license
#swift 

Rupert Beatty

1666834800

Kanna: Kanna is an XML/HTML Parser for Swift

Kanna(鉋)

Kanna(鉋) is a cross-platform XML/HTML parser (macOS, iOS, tvOS, watchOS, and Linux).

It was inspired by Nokogiri(鋸).

ℹ️ Documentation

Features

  •  XPath 1.0 support for document searching
  •  CSS3 selector support for document searching
  •  Support for namespaces
  •  Comprehensive test suite

Installation for Swift 5

CocoaPods

Add the following to your Podfile:

use_frameworks!
pod 'Kanna', '~> 5.2.2'

Carthage

Add the following to your Cartfile:

github "tid-kijyun/Kanna" ~> 5.2.2

For Xcode 11.3 and earlier, the following settings are required.

In the project settings add $(SDKROOT)/usr/include/libxml2 to the "header search paths" field

Swift Package Manager

Install libxml2 on your computer:

// macOS: For Xcode 11.3 and earlier, the following settings are required.
$ brew install libxml2
$ brew link --force libxml2

// Linux(Ubuntu):
$ sudo apt-get install libxml2-dev

Add the following to your Package.swift:

// swift-tools-version:5.0
import PackageDescription

let package = Package(
    name: "YourProject",
    dependencies: [
        .package(url: "https://github.com/tid-kijyun/Kanna.git", from: "5.2.2"),
    ],
    targets: [
        .target(
            name: "YourTarget",
            dependencies: ["Kanna"]),
    ]
)

$ swift build

Note: If a build error occurs, try running the following command:

// Linux(Ubuntu)
$ sudo apt-get install pkg-config

Manual Installation

  1. Add these files to your project:
    Kanna.swift
    CSS.swift
    libxmlHTMLDocument.swift
    libxmlHTMLNode.swift
    libxmlParserOption.swift
    Modules
  2. In the target settings add $(SDKROOT)/usr/include/libxml2 to the Search Paths > Header Search Paths field
  3. In the target settings add $(SRCROOT)/Modules to the Swift Compiler - Search Paths > Import Paths field

Synopsis

import Kanna

let html = "<html>...</html>"

if let doc = try? HTML(html: html, encoding: .utf8) {
    print(doc.title)
    
    // Search for nodes by CSS
    for link in doc.css("a, link") {
        print(link.text)
        print(link["href"])
    }
    
    // Search for nodes by XPath
    for link in doc.xpath("//a | //link") {
        print(link.text)
        print(link["href"])
    }
}

let xml = "..."
if let doc = try? Kanna.XML(xml: xml, encoding: .utf8) {
    let namespaces = [
                    "o":  "urn:schemas-microsoft-com:office:office",
                    "ss": "urn:schemas-microsoft-com:office:spreadsheet"
                ]
    if let author = doc.at_xpath("//o:Author", namespaces: namespaces) {
        print(author.text)
    }
}

Download Details:

Author: Tid-kijyun
Source Code: https://github.com/tid-kijyun/Kanna 
License: MIT license

#swift #html #xml 

Ava Watson

1595318322

Know Everything About HTML With HTML Experts

HTML stands for HyperText Markup Language. It is the markup language used to display documents in a web browser. Technologies such as Cascading Style Sheets (CSS) and scripting languages such as JavaScript assist HTML. Websites and web designs are created with HTML, and it also has a wide range of academic applications. An HTML document consists of a series of elements, which tell the browser how to display the content.

An HTML element is a component of an HTML document, and elements are what browsers use to display web pages. An HTML document is a mixture of text nodes and HTML elements.

Basics of HTML

The fundamental components of HTML are:

  1. Head: carries the setup information for the program and web pages.
  2. Body: carries the actual substance that is shown on the web page.
  3. HTML: the information starts and ends with the <html> and </html> labels.
  4. Comments: come up in between <!-- and -->.
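
The components listed above can be seen together in a minimal page. The sketch below keeps the page in a Swift string, to stay in the same language as the parser examples elsewhere on this page, and checks that each component is present. The page content itself is made up for illustration.

```swift
import Foundation

// A minimal page containing each component from the list: html, head, body, comments.
let page = """
<!DOCTYPE html>
<html>
  <head>
    <!-- Setup information: character set, title, stylesheets -->
    <meta charset="utf-8">
    <title>Example page</title>
  </head>
  <body>
    <!-- The content shown in the browser window -->
    <p>Hello, world.</p>
  </body>
</html>
"""

// Quick structural check: which of the expected markers appear in the markup?
let markers = ["<html>", "</html>", "<head>", "<body>", "<!--"]
let found = markers.filter { page.contains($0) }
print(found)
```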

HTML versions timeline

HTML was created in 1990 and is updated regularly. The timeline for the HTML versions is:

  1. HTML 2: November 1995
  2. HTML 3: January 1997
  3. HTML 4: December 1997; April 1998; December 1999; May 2000
  4. HTML 5: October 2014; November 2016; December 2017

HTML draft version timelines are

  1. October 1991
  2. June 1992
  3. November 1992
  4. June 1993
  5. November 1993
  6. November 1994
  7. April 1995
  8. January 2008
  9. HTML 5-
    2011, last call
    2012 candidate recommendation
    2014 proposed recommendation and recommendation

HTML helps in creating web pages. Web pages contain text, pictures, colour schemes, tables, and a variety of other things, and HTML allows all of these on a page.

HTML has a lot of attributes, and it can be difficult to memorize them. HTML can be tricky: sometimes it is hard to find the single mistake that keeps a web page from functioning properly.

Many minor details have to be kept in mind in HTML. To complete an HTML assignment, it is always advisable to seek help from online experts. These experts are well trained and well acquainted with the subject. They provide quality content within the prescribed deadline. With several positive reviews, online expert help for HTML assignments is highly recommended.

#html assignment help #html assignment writing help #online html assignment writing help #html assignment help service online #what is html #about html

Wasswa Meagan

1619678404

HTML Vs XML: Difference Between HTML and XML [2021]

HTML stands for Hypertext Markup Language, while XML stands for Extensible Markup Language. The purpose of HTML is to display data, with a focus on how the data looks. HTML therefore describes a web page’s structure and displays information, whereas XML structures, stores, and transfers information and describes what the data is.

In this article, HTML and XML shall be discussed in detail to understand the differences between them.

What is HTML?

Hypertext Markup Language (HTML) is a markup language that displays data and describes a web page’s structure. Hypertext refers to the hyperlinks an HTML page contains, which are what make browsing the web possible. Clicking a hyperlink takes the reader to another place on the internet, and links can be followed in any order.

The term markup language refers to the way tags are used to define the page layout and the elements within the page. An HTML document consists of various HTML elements, each comprising tags and their content. HTML enables the creation of links between documents, is static, and can ignore small errors. In HTML, some closing tags are optional. It can be described as a markup language that makes text more dynamic and interactive.
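
HTML's tolerance for small errors contrasts with XML, where every element must be closed. As a minimal sketch of that difference, the helper below (the name `isWellFormedXML` is an assumption made for this example) feeds two fragments to Foundation's `XMLParser`, which rejects markup that is not well-formed. A browser would typically still render the second fragment as HTML.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

/// Returns true when `text` parses as well-formed XML.
func isWellFormedXML(_ text: String) -> Bool {
    let parser = XMLParser(data: Data(text.utf8))
    return parser.parse()
}

// Every element is closed, so a strict XML parser accepts this fragment.
let strict = isWellFormedXML("<p>One<br/>Two</p>")

// The <br> and <p> elements are never closed, so this is not well-formed XML.
let sloppy = isWellFormedXML("<p>One<br>Two")

print(strict, sloppy)
```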

#software development #html #html vs xml #xml
