How to Format A Number To 2 Decimal Places in JavaScript


You can use the toFixed() method to format a number to 2 decimal places in JavaScript. The toFixed() method takes the number of digits that should appear after the decimal point and returns a string representing the given number rounded to that precision.

const num1 = 12.865
const res1 = num1.toFixed(2)
console.log(res1) // 12.87

const num2 = 19
const res2 = num2.toFixed(2)
console.log(res2) // 19.00

In the above examples, we called the toFixed() method on the number, passing it 2 as a parameter to format the number to 2 decimal places.

The only parameter toFixed() takes is the number of digits to appear after the decimal point. The value must be between 0 and 100 inclusive. If the argument is omitted, it is treated as 0. If you pass a value outside this range, a RangeError is thrown:

const num1 = 17.8654553
const res1 = num1.toFixed()
console.log(res1) // 18

const num2 = 55.3469
const res2 = num2.toFixed(120)
console.log(res2)
// RangeError: toFixed() digits argument must be between 0 and 100

Using toFixed() with negative number literals may produce unexpected results, because member access binds tighter than the unary minus: -7.892.toFixed(2) is evaluated as -(7.892.toFixed(2)), which coerces the result back to a number. Wrap the negative number in parentheses to get a formatted string:

const res1 = -7.892.toFixed(2)
console.log(res1) // -7.89
console.log(typeof res1) // number --> ❌

const res2 = (-7.892).toFixed(2)
console.log(res2) // -7.89
console.log(typeof res2) // string -->  ✅

If the number is wrapped in a string, you need to convert the string to a number first before calling the toFixed() method:

const str = '22.567'
const res = parseFloat(str).toFixed(2)
console.log(res) // 22.57
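
If you need this in several places, you can wrap the steps above in a small helper. The sketch below is one possible approach (the name formatTwoDecimals is just an illustration): it accepts either a number or a numeric string, parses strings first, and returns the toFixed(2) string.

// Minimal helper sketch: format a number or numeric string to 2 decimal places
const formatTwoDecimals = (value) => {
  // Convert numeric strings to numbers first, as shown above
  const num = typeof value === 'string' ? parseFloat(value) : value
  // Reject anything that is not a finite number
  if (!Number.isFinite(num)) {
    throw new TypeError('Expected a number or numeric string')
  }
  return num.toFixed(2) // always a string, e.g. '19.00'
}

console.log(formatTwoDecimals(19))       // 19.00
console.log(formatTwoDecimals('22.567')) // 22.57
console.log(formatTwoDecimals(-7.892))   // -7.89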

Original article source at: https://attacomsian.com/

#javascript #number #format 


MatrixMarket.jl: Julia Package to Read MatrixMarket File format

MatrixMarket

Package to read/write matrices from/to files in the Matrix Market native exchange format.

The Matrix Market is a NIST repository of "test data for use in comparative studies of algorithms for numerical linear algebra, featuring nearly 500 sparse matrices from a variety of applications, as well as matrix generation tools and services." Over time, the Matrix Market's native exchange format has become one of the de facto standard file formats for exchanging matrix data.

Usage

Read

using MatrixMarket
M = MatrixMarket.mmread("myfile.mtx")

M will be a sparse or dense matrix depending on whether the file contains a matrix in coordinate format or array format. The specific type of M may be Symmetric or Hermitian depending on the symmetry information contained in the file header.

MatrixMarket.mmread("myfile.mtx", true)

Returns raw data from the file header. Does not read in the actual matrix elements.

Write

MatrixMarket.mmwrite("myfile.mtx", M)

M has to be a sparse matrix.
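
As a minimal round-trip sketch (the file name is illustrative), you could write a random sparse matrix and read it back with the functions shown above:

using SparseArrays, MatrixMarket

A = sprand(100, 100, 0.05)              # random sparse matrix, ~5% nonzeros
MatrixMarket.mmwrite("random.mtx", A)   # write it in Matrix Market format
B = MatrixMarket.mmread("random.mtx")   # read it back
A == B                                  # expected to be true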

Download Details:

Author: JuliaSparse
Source Code: https://github.com/JuliaSparse/MatrixMarket.jl 
License: View license

#julia #file #format 


Imagineformat.jl: Read .imagine Files in Julia

ImagineFormat

Imagine is an acquisition program for light sheet microscopy written by Zhongsheng Guo in Tim Holy's lab. This package implements a loader for the file format for the Julia programming language. Each Imagine "file" consists of two parts (as two separate files): a *.imagine file which contains the (ASCII) header, and a *.cam file which contains the camera data. The *.cam file is a raw byte dump and is compatible with the NRRD "raw" file format.

Usage

Read Imagine files like this:

using Images
img = load("filename")

Converting to NRRD

You can write an NRRD header (*.nhdr) from an Imagine header as follows:

h = ImagineFormat.parse_header(filename)  # the .imagine file name
imagine2nrrd(nrrdname, h, datafilename)

where datafilename is the name of the *.cam file. It is required by the *.nhdr file to point to the actual data.

Writing Imagine headers

You can use the non-exported function ImagineFormat.save_header:

save_header(destname, srcname, img, [T])

destname is the output *.imagine file; srcname is the name of the "template" file. Certain header values (e.g., size information) are updated by reference to img. The optional argument T allows you to specify a different element type than possessed by img.
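
As a minimal sketch (file names here are purely illustrative), writing a header for an image loaded from an existing acquisition might look like this:

using Images, ImagineFormat

img = load("template.imagine")   # load an existing Imagine acquisition
# Write a new header using "template.imagine" as the template; size information
# is taken from img, and Float32 optionally overrides the element type
ImagineFormat.save_header("new.imagine", "template.imagine", img, Float32)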

Download Details:

Author: Timholy
Source Code: https://github.com/timholy/ImagineFormat.jl 
License: View license

#julia #files #format 


10 Popular PHP Libraries for Working with Markup and CSS Formats

In today's post we will learn about 10 Popular PHP Libraries for Working with Markup and CSS Formats.

What is CSS Markup?

After you've settled upon the right design for your new website, that design will need to be digitized through a process known as CSS Markup.

CSS is the language for describing the presentation of Web pages. This includes colors, layout, and font information, as well as how to change the presentation on different types of devices, such as those with large screens, small screens, or printers. CSS is independent of HTML and can be used with any XML-based markup language.

The separation of HTML from CSS makes it easier to maintain the code, share style sheets across pages, and tailor pages to different environments. This is referred to as the separation of structure (or content) from presentation.

Table of contents:

  • Cebe Markdown - A fast and extensible Markdown parser.
  • CommonMark PHP - Highly-extensible Markdown parser which fully supports the CommonMark spec.
  • Decoda - A lightweight markup parser library.
  • Essence - A library for extracting web media.
  • Embera - An Oembed consumer library.
  • HTML to Markdown - Converts HTML into Markdown.
  • HTML5 PHP - An HTML5 parser and serializer library.
  • Parsedown - Another Markdown parser.
  • PHP CSS Parser - A Parser for CSS Files written in PHP.
  • PHP Markdown - A Markdown parser.

1 - Cebe Markdown:

A fast and extensible Markdown parser.

What is this?

A set of PHP classes, each representing a Markdown flavor, and a command line tool for converting markdown files to HTML files.

The implementation focus is to be fast (see benchmark) and extensible. Parsing Markdown to HTML is as simple as calling a single method (see Usage), and the implementation gives the expected results even in non-trivial edge cases.

Extending the Markdown language with new elements is as simple as adding a new method to the class that converts the markdown text to the expected output in HTML. This is possible without dealing with complex and error prone regular expressions. It is also possible to hook into the markdown structure and add elements or read meta information using the internal representation of the Markdown text as an abstract syntax tree (see Extending the language).

Installation

PHP 5.4 or higher is required to use it. It will also run on Facebook's HHVM.

The library uses PHPDoc annotations to determine the markdown elements that should be parsed. So in case you are using PHP opcache, make sure it does not strip comments.

Installation is recommended to be done via composer by running:

composer require cebe/markdown "~1.2.0"

Alternatively you can add the following to the require section in your composer.json manually:

"cebe/markdown": "~1.2.0"

Run composer update cebe/markdown afterwards.

Note: If you have configured PHP with opcache you need to enable the opcache.save_comments option because inline element parsing relies on PHPdoc annotations to find declared elements.

Usage

In your PHP project

To parse your markdown you need only two lines of code. The first one is to choose the markdown flavor as one of the following:

  • Traditional Markdown: $parser = new \cebe\markdown\Markdown();
  • Github Flavored Markdown: $parser = new \cebe\markdown\GithubMarkdown();
  • Markdown Extra: $parser = new \cebe\markdown\MarkdownExtra();

The next step is to call the parse()-method for parsing the text using the full markdown language or calling the parseParagraph()-method to parse only inline elements.

Here are some examples:

// traditional markdown and parse full text
$parser = new \cebe\markdown\Markdown();
echo $parser->parse($markdown);

// use github markdown
$parser = new \cebe\markdown\GithubMarkdown();
echo $parser->parse($markdown);

// use markdown extra
$parser = new \cebe\markdown\MarkdownExtra();
echo $parser->parse($markdown);

// parse only inline elements (useful for one-line descriptions)
$parser = new \cebe\markdown\GithubMarkdown();
echo $parser->parseParagraph($markdown);

You may optionally set one of the following options on the parser object:

For all Markdown Flavors:

  • $parser->html5 = true to enable HTML5 output instead of HTML4.
  • $parser->keepListStartNumber = true to enable keeping the numbers of ordered lists as specified in the markdown. The default behavior is to always start from 1 and increment by one regardless of the number in markdown.

For GithubMarkdown:

  • $parser->enableNewlines = true to convert all newlines to <br/>-tags. By default only newlines with two preceding spaces are converted to <br/>-tags.

It is recommended to use UTF-8 encoding for the input strings. Other encodings may work, but are currently untested.
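
Putting the options above together, a short sketch (assuming $markdown already holds your input text) might look like this:

// GitHub-flavored Markdown with HTML5 output and hard line breaks
$parser = new \cebe\markdown\GithubMarkdown();
$parser->html5 = true;           // emit HTML5 output instead of HTML4
$parser->enableNewlines = true;  // convert all newlines to <br/>-tags
echo $parser->parse($markdown);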

View on Github

2 - CommonMark PHP:

Highly-extensible Markdown parser which fully supports the CommonMark spec.

league/commonmark is a highly-extensible PHP Markdown parser created by Colin O'Dell which supports the full CommonMark spec and GitHub-Flavored Markdown. It is based on the CommonMark JS reference implementation by John MacFarlane (@jgm).

📦 Installation & Basic Usage

This project requires PHP 7.4 or higher with the mbstring extension. To install it via Composer simply run:

$ composer require league/commonmark

The CommonMarkConverter class provides a simple wrapper for converting CommonMark to HTML:

use League\CommonMark\CommonMarkConverter;

$converter = new CommonMarkConverter([
    'html_input' => 'strip',
    'allow_unsafe_links' => false,
]);

echo $converter->convert('# Hello World!');

// <h1>Hello World!</h1>

Or if you want GitHub-Flavored Markdown, use the GithubFlavoredMarkdownConverter class instead:

use League\CommonMark\GithubFlavoredMarkdownConverter;

$converter = new GithubFlavoredMarkdownConverter([
    'html_input' => 'strip',
    'allow_unsafe_links' => false,
]);

echo $converter->convert('# Hello World!');

// <h1>Hello World!</h1>

Please note that only UTF-8 and ASCII encodings are supported. If your Markdown uses a different encoding please convert it to UTF-8 before running it through this library.

🔒 If you will be parsing untrusted input from users, please consider setting the html_input and allow_unsafe_links options per the example above. See https://commonmark.thephpleague.com/security/ for more details. If you also do choose to allow raw HTML input from untrusted users, consider using a library (like HTML Purifier) to provide additional HTML filtering.

🧪 Testing

$ composer test

This will also test league/commonmark against the latest supported spec.

🚀 Performance Benchmarks

You can compare the performance of league/commonmark to other popular parsers by running the included benchmark tool:

$ ./tests/benchmark/benchmark.php

View on Github

3 - Decoda:

A lightweight markup parser library.

Requirements

  • PHP 5.6.0+
    • Multibyte
  • Composer

Features

  • Parses custom code to valid (X)HTML markup
  • Setting to make links and emails auto-clickable
  • Setting to use shorthand text for links and emails
  • Filters to parse markup and custom code
  • Hooks to execute callbacks during the parsing cycle
  • Loaders to load resources and files for configuration
  • Engines to render complex markup using a template system
  • Can censor offensive words
  • Can convert smiley faces into images
  • Basic support for localized messages
  • Parser result caching
  • Supports a wide range of tags
  • Parent child node hierarchy
  • Fixes incorrectly nested tags by removing the broken/unclosed tags
  • Self closing tags
  • Logs errors for validation
  • Tag and attribute aliasing

Filters

The following filters and supported tags are available.

  • Default - b, i, u, s, sup, sub, br, hr, abbr, time
  • Block - align, float, hide, alert, note, div, spoiler, left, right, center, justify
  • Code - code, source, var
  • Email - email, mail
  • Image - image, img
  • List - list, olist, ol, ul, li, *
  • Quote - quote
  • Text - font, size, color, h1-h6
  • Url - url, link
  • Video - video, youtube, vimeo, veoh, liveleak, dailymotion, myspace, wegame, collegehumor
  • Table - table, thead, tbody, tfoot, tr, td, th, row, col

Hooks

The following hooks are available.

  • Censor - Censors all words found within config/censored
  • Clickable - Converts all non-tag wrapped URLs and emails into clickable links
  • Emoticon - Converts all smilies found within config/emoticons into emoticon images

Storage Engines

The following caching layers are supported.

  • In-Memory
  • Memcache
  • Redis
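
As a minimal usage sketch (assuming the \Decoda\Decoda class and its defaults() and parse() methods from the project README), parsing a BBCode-style string could look like this:

use Decoda\Decoda;

// Load the default filters and hooks, then parse custom code into HTML
$code = new Decoda('Hello, [b]world[/b]!');
$code->defaults();
echo $code->parse(); // e.g. Hello, <b>world</b>!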

View on Github

4 - Essence:

A library for extracting web media.

Essence is a simple PHP library to extract media information from websites, like YouTube videos, Twitter statuses, or blog articles.

If you were already using Essence 2.x.x, you should take a look at the migration guide.

Installation

composer require essence/essence

Example

Essence is designed to be really easy to use. Using the main class of the library, you can retrieve information in just those few lines:

$Essence = new Essence\Essence();
$Media = $Essence->extract('http://www.youtube.com/watch?v=39e3KYAmXK4');

if ($Media) {
	// That's all, you're good to go !
}

Then, just do anything you want with the data:

<article>
	<header>
		<h1><?php echo $Media->title; ?></h1>
		<p>By <?php echo $Media->authorName; ?></p>
	</header>

	<div class="player">
		<?php echo $Media->html; ?>
	</div>
</article>

What you get

Using Essence, you will mainly interact with Media objects. Media is a simple container for all the information that is fetched from a URL.

Here are the default properties it provides:

  • type
  • version
  • url
  • title
  • description
  • authorName
  • authorUrl
  • providerName
  • providerUrl
  • cacheAge
  • thumbnailUrl
  • thumbnailWidth
  • thumbnailHeight
  • html
  • width
  • height

These properties were gathered from the OEmbed and OpenGraph specifications and merged into a unified interface. Based on such standards, these properties should be a solid starting point.

However, "non-standard" properties can and will also be set.

Here is how you can manipulate the Media properties:

// through dedicated methods
if (!$Media->has('foo')) {
	$Media->set('foo', 'bar');
}

$value = $Media->get('foo');

// or directly like a class attribute
$Media->customValue = 12;

Note that Essence will always try to fill the html property when it is not available.

View on Github

5 - Embera:

An Oembed consumer library.

Embera is an Oembed consumer library written in PHP. It takes URLs from a text, queries the matching service for information about the media, and embeds the resulting HTML. It supports more than 150 sites, such as YouTube, Twitter, Livestream, Dailymotion, Instagram, Vimeo, and many more.

Installation

Install the latest stable version with:

$ composer require mpratt/embera:~2.0

Standalone Installation (without Composer)

Download the latest release or clone this repository and include the Autoloader.php file inside the Embera/src directory.

require '....../Autoloader.php';

use Embera\Embera;

$embera = new Embera();

Requirements

  • PHP >= 7.0 (It should work on 5.6)
  • Curl or allow_url_fopen should be enabled

Basic Usage

The most common or basic example is this one:

use Embera\Embera;

$embera = new Embera();
echo $embera->autoEmbed('Hi! Have you seen this video? https://www.youtube.com/watch?v=J---aiyznGQ Its the best!');

The last example returns something like the following text:

Hi! Have you seen this video?
<iframe
  width="459"
  height="344"
  src="https://www.youtube.com/embed/J---aiyznGQ?feature=oembed"
  frameborder="0"
  allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"
  allowfullscreen
></iframe>
Its the best!

You can also inspect urls for their oembed data:

use Embera\Embera;

$embera = new Embera();
print_r($embera->getUrlData([
    'https://vimeo.com/374131624',
    'https://www.flickr.com/photos/bees/8597283706/in/photostream',
]));

View on Github

6 - HTML to Markdown:

Converts HTML into Markdown.

Library which converts HTML to Markdown for your sanity and convenience.

Requires: PHP 7.2+

Lead Developer: @colinodell

Original Author: @nickcernis

Why convert HTML to Markdown?

"What alchemy is this?" you mutter. "I can see why you'd convert Markdown to HTML," you continue, already labouring the question somewhat, "but why go the other way?"

Typically you would convert HTML to Markdown if:

  1. You have an existing HTML document that needs to be edited by people with good taste.
  2. You want to store new content in HTML format but edit it as Markdown.
  3. You want to convert HTML email to plain text email.
  4. You know a guy who's been converting HTML to Markdown for years, and now he can speak Elvish. You'd quite like to be able to speak Elvish.
  5. You just really like Markdown.

How to use it

Require the library by issuing this command:

composer require league/html-to-markdown

Add require 'vendor/autoload.php'; to the top of your script.

Next, create a new HtmlConverter instance, passing in your valid HTML code to its convert() function:

use League\HTMLToMarkdown\HtmlConverter;

$converter = new HtmlConverter();

$html = "<h3>Quick, to the Batpoles!</h3>";
$markdown = $converter->convert($html);

The $markdown variable now contains the Markdown version of your HTML as a string:

echo $markdown; // ==> ### Quick, to the Batpoles!

The included demo directory contains an HTML->Markdown conversion form to try out.

Conversion options

By default, HTML To Markdown preserves HTML tags without Markdown equivalents, like <span> and <div>.

To strip HTML tags that don't have a Markdown equivalent while preserving the content inside them, set strip_tags to true, like this:

$converter = new HtmlConverter(array('strip_tags' => true));

$html = '<span>Turnips!</span>';
$markdown = $converter->convert($html); // $markdown now contains "Turnips!"

Or more explicitly, like this:

$converter = new HtmlConverter();
$converter->getConfig()->setOption('strip_tags', true);

$html = '<span>Turnips!</span>';
$markdown = $converter->convert($html); // $markdown now contains "Turnips!"

Note that only the tags themselves are stripped, not the content they hold.

To strip tags and their content, pass a space-separated list of tags in remove_nodes, like this:

$converter = new HtmlConverter(array('remove_nodes' => 'span div'));

$html = '<span>Turnips!</span><div>Monkeys!</div>';
$markdown = $converter->convert($html); // $markdown now contains ""

By default, all comments are stripped from the content. To preserve them, use the preserve_comments option, like this:

$converter = new HtmlConverter(array('preserve_comments' => true));

$html = '<span>Turnips!</span><!-- Monkeys! -->';
$markdown = $converter->convert($html); // $markdown now contains "Turnips!<!-- Monkeys! -->"

To preserve only specific comments, set preserve_comments with an array of strings, like this:

$converter = new HtmlConverter(array('preserve_comments' => array('Eggs!')));

$html = '<span>Turnips!</span><!-- Monkeys! --><!-- Eggs! -->';
$markdown = $converter->convert($html); // $markdown now contains "Turnips!<!-- Eggs! -->"

By default, placeholder links are preserved. To strip the placeholder links, use the strip_placeholder_links option, like this:

$converter = new HtmlConverter(array('strip_placeholder_links' => true));

$html = '<a>Github</a>';
$markdown = $converter->convert($html); // $markdown now contains "Github"

View on Github

7 - HTML5 PHP:

An HTML5 parser and serializer library.

HTML5 is a standards-compliant HTML5 parser and writer written entirely in PHP. It is stable and used in many production websites, and has well over five million downloads.

HTML5 provides the following features.

  • An HTML5 serializer
  • Support for PHP namespaces
  • Composer support
  • Event-based (SAX-like) parser
  • A DOM tree builder
  • Interoperability with QueryPath
  • Runs on PHP 5.3.0 or newer

Installation

Install HTML5-PHP using composer.

By adding the masterminds/html5 dependency to your composer.json file:

{
  "require" : {
    "masterminds/html5": "^2.0"
  }
}

By invoking require command via composer executable:

composer require masterminds/html5

Basic Usage

HTML5-PHP has a high-level API and a low-level API.

Here is how you use the high-level HTML5 library API:

<?php
// Assuming you installed from Composer:
require "vendor/autoload.php";

use Masterminds\HTML5;

// An example HTML document:
$html = <<< 'HERE'
  <html>
  <head>
    <title>TEST</title>
  </head>
  <body id='foo'>
    <h1>Hello World</h1>
    <p>This is a test of the HTML5 parser.</p>
  </body>
  </html>
HERE;

// Parse the document. $dom is a DOMDocument.
$html5 = new HTML5();
$dom = $html5->loadHTML($html);

// Render it as HTML5:
print $html5->saveHTML($dom);

// Or save it to a file:
$html5->save($dom, 'out.html');

The $dom created by the parser is a full DOMDocument object. And the save() and saveHTML() methods will take any DOMDocument.

View on Github

8 - Parsedown:

Another Markdown parser.

Installation

Install the composer package:

composer require erusev/parsedown

Or download the latest release and include Parsedown.php

Example

$Parsedown = new Parsedown();

echo $Parsedown->text('Hello _Parsedown_!'); # prints: <p>Hello <em>Parsedown</em>!</p>

You can also parse inline markdown only:

echo $Parsedown->line('Hello _Parsedown_!'); # prints: Hello <em>Parsedown</em>!

More examples in the wiki and in this video tutorial.

Security

Parsedown is capable of escaping user-input within the HTML that it generates. Additionally Parsedown will apply sanitisation to additional scripting vectors (such as scripting link destinations) that are introduced by the markdown syntax itself.

To tell Parsedown that it is processing untrusted user-input, use the following:

$Parsedown->setSafeMode(true);

If, instead, you wish to allow HTML within untrusted user-input but still want the output to be free from XSS, it is recommended that you use an HTML sanitiser that allows HTML tags to be whitelisted, like HTML Purifier.

In both cases you should strongly consider employing defence-in-depth measures, like deploying a Content-Security-Policy (a browser security feature) so that your page is likely to be safe even if an attacker finds a vulnerability in one of the first lines of defence above.

Security of Parsedown Extensions

Safe mode does not necessarily yield safe results when using extensions to Parsedown. Extensions should be evaluated on their own to determine their specific safety against XSS.

Escaping HTML

WARNING: This method isn't safe from XSS!

If you wish to escape HTML in trusted input, you can use the following:

$Parsedown->setMarkupEscaped(true);

Beware that this still allows users to insert unsafe scripting vectors, such as links like [xss](javascript:alert%281%29).

View on Github

9 - PHP CSS Parser:

A Parser for CSS Files written in PHP. Allows extraction of CSS files into a data structure, manipulation of said structure and output as (optimized) CSS.

Usage

Installation using Composer

composer require sabberworm/php-css-parser

Extraction

To use the CSS Parser, create a new instance. The constructor takes the following form:

new \Sabberworm\CSS\Parser($css);

To read a file, for example, you’d do the following:

$parser = new \Sabberworm\CSS\Parser(file_get_contents('somefile.css'));
$cssDocument = $parser->parse();

The resulting CSS document structure can be manipulated prior to being output.
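
For instance, a minimal sketch of the parse-then-output cycle (assuming the render() method on the parsed document) could look like this:

$parser = new \Sabberworm\CSS\Parser(file_get_contents('somefile.css'));
$cssDocument = $parser->parse();

// ... manipulate $cssDocument here ...

echo $cssDocument->render(); // output the (possibly modified) CSS as text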

Options

Charset

The charset option will only be used if the CSS file does not contain an @charset declaration. UTF-8 is the default, so you won’t have to create a settings object at all if you don’t intend to change that.

$settings = \Sabberworm\CSS\Settings::create()
    ->withDefaultCharset('windows-1252');
$parser = new \Sabberworm\CSS\Parser($css, $settings);

Strict parsing

To have the parser throw an exception when encountering invalid/unknown constructs (as opposed to trying to ignore them and carry on parsing), supply a thusly configured \Sabberworm\CSS\Settings object:

$parser = new \Sabberworm\CSS\Parser(
    file_get_contents('somefile.css'),
    \Sabberworm\CSS\Settings::create()->beStrict()
);

Note that this will also disable a workaround for parsing the unquoted variant of the legacy IE-specific filter rule.

Disable multibyte functions

To achieve faster parsing, you can choose to have PHP-CSS-Parser use regular string functions instead of mb_* functions. This should work fine in most cases, even for UTF-8 files, as all the multibyte characters are in string literals. Still, it is not recommended to use this with input you have no control over, as it is not thoroughly covered by test cases.

$settings = \Sabberworm\CSS\Settings::create()->withMultibyteSupport(false);
$parser = new \Sabberworm\CSS\Parser($css, $settings);

View on Github

10 - PHP Markdown:

A Markdown parser.

Introduction

This is a library package that includes the PHP Markdown parser and its sibling PHP Markdown Extra with additional features.

Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML).

"Markdown" is actually two things: a plain text markup syntax, and a software tool, originally written in Perl, that converts the plain text markup to HTML. PHP Markdown is a port to PHP of the original Markdown program by John Gruber.

Requirement

This library package requires PHP 7.4 or later.

Note: The older plugin/library hybrid package for PHP Markdown and PHP Markdown Extra is no longer maintained but will work with PHP 4.0.5 and later.

You might need to set pcre.backtrack_limit higher than 1 000 000 (the default), though the default is usually fine.

Usage

To use this library with Composer, first install it with:

$ composer require michelf/php-markdown

Then include Composer's generated vendor/autoload.php to enable autoloading:

require 'vendor/autoload.php';

Without Composer, for autoloading to work, your project needs an autoloader compatible with PSR-4 or PSR-0. See the included Readme.php file for a minimal autoloader setup. (If you cannot use autoloading, see below.)

With class autoloading in place:

use Michelf\Markdown;
$my_html = Markdown::defaultTransform($my_text);

Markdown Extra syntax is also available the same way:

use Michelf\MarkdownExtra;
$my_html = MarkdownExtra::defaultTransform($my_text);

If you wish to use PHP Markdown with another text filter function built to parse HTML, you should filter the text after the transform function call. This is an example with PHP SmartyPants:

use Michelf\Markdown, Michelf\SmartyPants;
$my_html = Markdown::defaultTransform($my_text);
$my_html = SmartyPants::defaultTransform($my_html);

All these examples use the static defaultTransform function found inside the parser class. If you want to customize the parser configuration, you can also instantiate it directly and change some configuration variables:

use Michelf\MarkdownExtra;
$parser = new MarkdownExtra;
$parser->fn_id_prefix = "post22-";
$my_html = $parser->transform($my_text);

To learn more, see the full list of configuration variables.

Usage without an autoloader

If you cannot use class autoloading, you can still use include or require to access the parser. To load the Michelf\Markdown parser, do it this way:

require_once 'Michelf/Markdown.inc.php';

Or, if you need the Michelf\MarkdownExtra parser:

require_once 'Michelf/MarkdownExtra.inc.php';

While the plain .php files depend on autoloading to work correctly, using the .inc.php files instead will eagerly load the dependencies that would be loaded on demand if you were using autoloading.

View on Github

Thank you for following this article.

#php #css #format 


Top 8 Libraries for Specific Formats Processing in JavaScript

In today's post we will learn about Top 8 Libraries for Specific Formats Processing in JavaScript. 

What is parsing a text?

Parsing is a grammatical exercise that involves breaking down a text into its component parts of speech with an explanation of the form, function, and syntactic relationship of each part so that the text can be understood. The term "parsing" comes from the Latin pars for "part (of speech)."

Table of contents:

  • jBinary - High-level I/O (loading, parsing, manipulating, serializing, saving) for binary files with declarative syntax for describing file types and data structures.
  • BabyParse - Fast and reliable CSV parser based on Papa Parse. Papa Parse is for the browser, Baby Parse is for Node.js.
  • CSV - A simple, blazing-fast CSV parser and encoder. Full RFC 4180 compliance.
  • json3 - A modern JSON implementation compatible with nearly all JavaScript platforms.
  • exif-js - JavaScript library for reading EXIF image metadata.
  • Parse-css - Standards-based CSS Parser.
  • Parser-lib - Collection of parsers written in JavaScript.
  • Parse-torrent - Parse a torrent identifier (magnet uri, .torrent file, info hash).

1 - jBinary: High-level I/O (loading, parsing, manipulating, serializing, saving) for binary files with declarative syntax for describing file types and data structures.

How can I use it?

Typical scenario:

  • Describe typeset with JavaScript-compatible declarative syntax (jBinary will do type caching for you).
  • Create jBinary instance from memory or from data source and your typeset.
  • Read/write data just as native JavaScript objects!

API Documentation.

Check out wiki for detailed API documentation.

Is there any example code?

How about TAR archive modification:

// configuring paths for Require.js
// (you can use CommonJS (Component, Node.js) or simple script tags as well)
require.config({
  paths: {
    jdataview: 'https://jdataview.github.io/dist/jdataview',
    jbinary: 'https://unpkg.com/jbinary@2.1.3/dist/browser/jbinary',
    TAR: 'https://jdataview.github.io/jBinary.Repo/typeSets/tar' // TAR archive typeset
  }
});

require(['jbinary', 'TAR'], function (jBinary, TAR) {
  // loading TAR archive with given typeset
  jBinary.load('sample.tar', TAR).then(function (jb/* : jBinary */) {
    // read everything using type aliased in TAR['jBinary.all']
    var files = jb.readAll();

    // do something with files in TAR archive (like rename them to upper case)
    files.forEach(function (file) {
      file.name = file.name.toUpperCase();
    });

    jb.writeAll(files, 0); // writing entire content from files array
    jb.saveAs('sample.new.tar'); // saving file under given name
  });
});

Run or edit it on JSBin.

View on Github

2 - BabyParse: Fast and reliable CSV parser based on Papa Parse. Papa Parse is for the browser, Baby Parse is for Node.js.

Installation

// simply install using npm
npm install babyparse --save

Basic Usage

// pass in the contents of a csv file
parsed = Baby.parse(csv);

// voila
rows = parsed.data;

Parse File(s)

Baby Parse will assume the input is a filename if it ends in .csv or .txt.

// Parse single file
parsed = Baby.parseFiles(file[, config])

rows = parsed.data
// Parse multiple files
// Files can be either an array of strings or objects { file: filename[, config: config] }
// When using an array of objects, if you include a config it will be used in place of the global config
parsed = Baby.parseFiles(files[, globalConfig])

rows = parsed[index].data

For a complete understanding of the power of this library, please refer to the Papa Parse web site.

View on Github

3 - CSV: A simple, blazing-fast CSV parser and encoder. Full RFC 4180 compliance.

Installation

MASTER is currently under development. As such, csv.src.js and csv.js are both unusable. Make sure you download csv.min.js.

Download csv.min.js and reference to it using your preferred method.

If you use Bower, or npm, install the comma-separated-values package.

Instantiation

Create a CSV instance with var csv = new CSV(data);, where data is a plain-text CSV string. You can supply options with the format var csv = new CSV(data, { option: value });.

Options

  • cast: true to automatically cast numbers and booleans to their JavaScript equivalents. false otherwise. Supply your own array to override autocasting. Defaults to true.
  • lineDelimiter: The string that separates lines from one another. If parsing, defaults to autodetection. If encoding, defaults to '\r\n'.
  • cellDelimiter: A 1-character-long string that separates values from one another. If parsing, defaults to autodetection. If encoding, defaults to ','.
  • header: true if the first row of the CSV contains header values, or supply your own array. Defaults to false.

You can update an option's value any time after instantiation with csv.set(option, value).

Quickstart

For those accustomed to JavaScript, the CSV.js API:

// The instance will set itself up for parsing or encoding on instantiation,
// which means that each instance can only either parse or encode.
// The `options` object is optional
var csv = new CSV(data, [options]);

// If the data you've supplied is an array,
// CSV#encode will return the encoded CSV.
// It will otherwise fail silently.
var encoded = csv.encode();

// If the data you've supplied is a string,
// CSV#parse will return the parsed CSV.
// It will otherwise fail silently.
var parsed = csv.parse();

// The CSV instance can return the record immediately after
// it's been encoded or parsed to prevent storing the results
// in a large array by calling CSV#forEach and passing in a function.
csv.forEach(function(record) {
  // do something with the record
});

// CSV includes some convenience class methods:
CSV.parse(data, options); // identical to `new CSV(data, options).parse()`
CSV.encode(data, options); // identical to `new CSV(data, options).encode()`
CSV.forEach(data, options, callback); // identical to `new CSV(data, options).forEach(callback)`

// For overriding automatic casting, set `options.cast` to an array.
// For `parsing`, valid array values are: 'Number', 'Boolean', and 'String'.
CSV.parse(data, { cast: ['String', 'Number', 'Number', 'Boolean'] });
// For `encoding`, valid array values are 'Array', 'Object', 'String', 'Null', and 'Primitive'.
CSV.encode(data, { cast: ['Primitive', 'Primitive', 'String'] });

View on Github

4 - json3: A modern JSON implementation compatible with nearly all JavaScript platforms.

Date Serialization

JSON 3 deviates from the specification in one important way: it does not define Date#toISOString() or Date#toJSON(). This preserves CommonJS compatibility and avoids polluting native prototypes. Instead, date serialization is performed internally by the stringify() implementation: if a date object does not define a custom toJSON() method, it is serialized as a simplified ISO 8601 date-time string.

Several native Date#toJSON() implementations produce date time strings that do not conform to the grammar outlined in the spec. In these environments, JSON 3 will override the native stringify() implementation. There is an issue on file to make these tests less strict.

Portions of the date serialization code are adapted from the date-shim project.

Usage

Web Browsers

<script src="//cdnjs.cloudflare.com/ajax/libs/json3/3.3.2/json3.min.js"></script>
<script>
  JSON.stringify({"Hello": 123});
  // => '{"Hello":123}'
  JSON.parse("[[1, 2, 3], 1, 2, 3, 4]", function (key, value) {
    if (typeof value == "number") {
      value = value % 2 ? "Odd" : "Even";
    }
    return value;
  });
  // => [["Odd", "Even", "Odd"], "Odd", "Even", "Odd", "Even"]
</script>

When used in a web browser, JSON 3 exposes an additional JSON3 object containing the noConflict() and runInContext() functions, as well as aliases to the stringify() and parse() functions.

noConflict and runInContext

  • JSON3.noConflict() restores the original value of the global JSON object and returns a reference to the JSON3 object.
  • JSON3.runInContext([context, exports]) initializes JSON 3 using the given context object (e.g., window, global, etc.), or the global object if omitted. If an exports object is specified, the stringify(), parse(), and runInContext() functions will be attached to it instead of a new object.
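
For instance, a brief sketch of handing the global JSON name back to the environment while keeping JSON 3 available under another variable:

<script>
  // Restore the original global JSON object; keep a reference to JSON 3
  var json3 = JSON3.noConflict();
  json3.stringify({"Hello": 123});
  // => '{"Hello":123}'
</script>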

View on Github

5 - Exif-js: JavaScript library for reading EXIF image metadata.

Install

Install exif-js through NPM:

npm install exif-js --save    

Or Bower:

bower install exif-js --save

Then add a script tag to your HTML, referencing your local copy of the file.

<script src="vendors/exif-js/exif-js"></script>

You can also use a minified version hosted on jsDelivr

<script src="https://cdn.jsdelivr.net/npm/exif-js"></script>

Usage

The package adds a global EXIF variable (or AMD or CommonJS equivalent).

Start with calling the EXIF.getData function. You pass it an image as a parameter:

  • either an image from a <img src="image.jpg">
  • OR a user selected image in a <input type="file"> element on your page.

As a second parameter you specify a callback function. In the callback function you should use this to access the image with the aforementioned metadata, which you can then use as you want. That image now has an extra exifdata property, which is a JavaScript object with the EXIF metadata. You can access its properties to get data like the image caption, the date a photo was taken, or its orientation.

You can get all tags at once with EXIF.getAllTags, or get a single tag with EXIF.getTag, where you specify the tag name as the second parameter. The tag names to use are listed in EXIF.Tags in exif.js.

Important: Note that you have to wait for the image to be completely loaded before calling getData or any other function; it will silently fail otherwise. You can implement this wait by running your EXIF-extracting logic in the window.onload handler, or in an image's own onload handler. For jQuery users, please note that you can NOT (reliably) use jQuery's ready event for this, because it fires before images are loaded. You could use $(window).load() instead of $(document).ready() (please note that exif-js has NO dependency on jQuery or any other external library).
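
A minimal sketch tying those steps together (the image id and tag names are just examples):

window.onload = function () {
  var img = document.getElementById('photo');
  EXIF.getData(img, function () {
    // Inside the callback, `this` is the image with its exifdata attached
    var make = EXIF.getTag(this, 'Make');
    var model = EXIF.getTag(this, 'Model');
    var allTags = EXIF.getAllTags(this);
    console.log(make, model, allTags);
  });
};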

View on Github

6 - Parse-css: Standards-based CSS Parser.

Using the Library

Include parse-css.js in your page. Then just call the desired parsing function, named after the algorithms in the spec: parseAStylesheet(), etc. You can pass a string or a list of tokens (such as what's produced by the tokenize() function). It'll return an appropriate object, as specified by the parsing function.

If you want to get access to the tokens directly, call tokenize() with a string; it'll return a list of tokens.
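
For example, a minimal sketch of both entry points (the CSS string is illustrative):

// Tokenize a string, then parse the same string as a stylesheet
var css = "p { color: red; }";
var tokens = tokenize(css);        // flat list of tokens
var sheet = parseAStylesheet(css); // stylesheet object containing the parsed rules
console.log(tokens, sheet);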

Note that the Syntax spec, and thus this parser, is extremely generic. It doesn't have any specific knowledge of CSS rules, just the core syntax, so it won't throw out invalid or unknown things, and it can't even actually parse the contents of blocks (because it doesn't know if they'll contain rules or declarations, and those are ambiguous without any context). I plan to add some functions that add more CSS knowledge (in an extensible way, so it'll retain anything custom that you want to handle yourself), but for now you have to do all the verification and additional parsing yourself.

Parsing Functions

Here's the full list of parsing functions. They do exactly what they say in their name, because they're named exactly the same as the corresponding section of the Syntax spec:

  • parseAStylesheet()
  • parseAListOfRules()
  • parseARule()
  • parseADeclaration()
  • parseAListOfDeclarations()
  • parseAComponentValue()
  • parseAListOfComponentValues()
  • parseACommaSeparatedListOfComponentValues()

Canonicalizing Against A Grammar

By default, the parser can only do so much; it knows how to interpret the top-level rules in a stylesheet, but not how to interpret the contents of anything below that. This means that anything nested within a top-level block is left as a bare token stream, requiring you to call the correct parsing function on it.

The canonicalize() function takes a parsing result and a grammar and transforms the result accordingly, rendering the result into an easier-to-digest form.

A grammar is an object with one of the following four forms:

{"stylesheet":true}
{
	"qualified": <grammar>,
	"@foo": <grammar>,
	"unknown": <function>
}
{
	"declarations": true,
	"@foo": <grammar>
	"unknown": <function>
}
null

View on Github

7 - Parser-lib: Collection of parsers written in JavaScript.

Introduction

The ParserLib CSS parser is a CSS3 SAX-inspired parser written in JavaScript. It handles standard CSS syntax as well as validation (checking of property names and values) although it is not guaranteed to thoroughly validate all possible CSS properties.

Adding to your project

The CSS parser is built for a number of different JavaScript environments. The most recently released version of the parser can be found in the dist directory when you check out the repository; run npm run build to regenerate them from the latest sources.

Node.js

You can use the CSS parser in a Node.js script via the standard npm package manager as the parserlib package (npm install parserlib):

var parserlib = require("parserlib");

var parser = new parserlib.css.Parser();

Alternatively, you can copy a single file version of the parser from dist/node-parserlib.js to your own project, and use it as follows:

var parserlib = require("./node-parserlib");
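
Because the parser is SAX-inspired, you typically attach event listeners before calling parse(). A rough sketch, assuming the "property" event described in the ParserLib documentation:

var parserlib = require("parserlib");
var parser = new parserlib.css.Parser();

// Log each declaration as it is parsed
parser.addListener("property", function (event) {
    console.log(event.property.toString() + ": " + event.value.toString());
});

parser.parse("a { color: red; }");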

Rhino

To use the CSS parser in a Rhino script, copy the file dist/parserlib.js to your project and then include it at the beginning:

load("parserlib.js");

View on Github

8 - Parse-torrent: Parse a torrent identifier (magnet uri, .torrent file, info hash).

install

npm install parse-torrent

usage

parse

The return value of parseTorrent will contain as much info as possible about the torrent. The only property that is guaranteed to be present is infoHash.

const parseTorrent = require('parse-torrent')
const fs = require('fs')

// info hash (as a hex string)
parseTorrent('d2474e86c95b19b8bcfdb92bc12c9d44667cfa36')
// { infoHash: 'd2474e86c95b19b8bcfdb92bc12c9d44667cfa36' }

// info hash (as a Buffer)
parseTorrent(new Buffer('d2474e86c95b19b8bcfdb92bc12c9d44667cfa36', 'hex'))
// { infoHash: 'd2474e86c95b19b8bcfdb92bc12c9d44667cfa36' }

// magnet uri (as a utf8 string)
parseTorrent('magnet:?xt=urn:btih:d2474e86c95b19b8bcfdb92bc12c9d44667cfa36')
// { xt: 'urn:btih:d2474e86c95b19b8bcfdb92bc12c9d44667cfa36',
//   infoHash: 'd2474e86c95b19b8bcfdb92bc12c9d44667cfa36' }

// magnet uri with torrent name
parseTorrent('magnet:?xt=urn:btih:d2474e86c95b19b8bcfdb92bc12c9d44667cfa36&dn=Leaves%20of%20Grass%20by%20Walt%20Whitman.epub')
// { xt: 'urn:btih:d2474e86c95b19b8bcfdb92bc12c9d44667cfa36',
//   dn: 'Leaves of Grass by Walt Whitman.epub',
//   infoHash: 'd2474e86c95b19b8bcfdb92bc12c9d44667cfa36',
//   name: 'Leaves of Grass by Walt Whitman.epub' }

// magnet uri with trackers
parseTorrent('magnet:?xt=urn:btih:d2474e86c95b19b8bcfdb92bc12c9d44667cfa36&tr=http%3A%2F%2Ftracker.example.com%2Fannounce')
// { xt: 'urn:btih:d2474e86c95b19b8bcfdb92bc12c9d44667cfa36',
//   tr: 'http://tracker.example.com/announce',
//   infoHash: 'd2474e86c95b19b8bcfdb92bc12c9d44667cfa36',
//   announce: [ 'http://tracker.example.com/announce' ] }

// .torrent file (as a Buffer)
parseTorrent(fs.readFileSync(__dirname + '/torrents/leaves.torrent'))
// { info:
//    { length: 362017,
//      name: <Buffer 4c 65 61 76 65 73 20 6f 66 20 47 72 61 73 73 20 62 79 20 57 61 6c 74 20 57 68 69 74 6d 61 6e 2e 65 70 75 62>,
//      'piece length': 16384,
//      pieces: <Buffer 1f 9c 3f 59 be ec 07 97 15 ec 53 32 4b de 85 69 e4 a0 b4 eb ec 42 30 7d 4c e5 55 7b 5d 39 64 c5 ef 55 d3 54 cf 4a 6e cc 7b f1 bc af 79 d1 1f a5 e0 be 06 ...> },
//   infoBuffer: <Buffer 64 36 3a 6c 65 6e 67 74 68 69 33 36 32 30 31 37 65 34 3a 6e 61 6d 65 33 36 3a 4c 65 61 76 65 73 20 6f 66 20 47 72 61 73 73 20 62 79 20 57 61 6c 74 20 57 ...>,
//   infoHash: 'd2474e86c95b19b8bcfdb92bc12c9d44667cfa36',
//   name: 'Leaves of Grass by Walt Whitman.epub',
//   private: false,
//   created: Thu Aug 01 2013 06:27:46 GMT-0700 (PDT),
//   comment: 'Downloaded from http://TheTorrent.org',
//   announce:
//    [ 'http://tracker.example.com/announce' ],
//   urlList: [],
//   files:
//    [ { path: 'Leaves of Grass by Walt Whitman.epub',
//        name: 'Leaves of Grass by Walt Whitman.epub',
//        length: 362017,
//        offset: 0 } ],
//   length: 362017,
//   pieceLength: 16384,
//   lastPieceLength: 1569,
//   pieces:
//    [ '1f9c3f59beec079715ec53324bde8569e4a0b4eb',
//      'ec42307d4ce5557b5d3964c5ef55d354cf4a6ecc',
//      '7bf1bcaf79d11fa5e0be06593c8faafc0c2ba2cf',
//      '76d71c5b01526b23007f9e9929beafc5151e6511',
//      '0931a1b44c21bf1e68b9138f90495e690dbc55f5',
//      '72e4c2944cbacf26e6b3ae8a7229d88aafa05f61',
//      'eaae6abf3f07cb6db9677cc6aded4dd3985e4586',
//      '27567fa7639f065f71b18954304aca6366729e0b',
//      '4773d77ae80caa96a524804dfe4b9bd3deaef999',
//      'c9dd51027467519d5eb2561ae2cc01467de5f643',
//      '0a60bcba24797692efa8770d23df0a830d91cb35',
//      'b3407a88baa0590dc8c9aa6a120f274367dcd867',
//      'e88e8338c572a06e3c801b29f519df532b3e76f6',
//      '70cf6aee53107f3d39378483f69cf80fa568b1ea',
//      'c53b506159e988d8bc16922d125d77d803d652c3',
//      'ca3070c16eed9172ab506d20e522ea3f1ab674b3',
//      'f923d76fe8f44ff32e372c3b376564c6fb5f0dbe',
//      '52164f03629fd1322636babb2c014b7dae582da4',
//      '1363965261e6ce12b43701f0a8c9ed1520a70eba',
//      '004400a267765f6d3dd5c7beb5bd3c75f3df2a54',
//      '560a61801147fa4ec7cf568e703acb04e5610a4d',
//      '56dcc242d03293e9446cf5e457d8eb3d9588fd90',
//      'c698de9b0dad92980906c026d8c1408fa08fe4ec' ] }

View on Github

Thank you for following this article.

#javascript #format #text 


AutoFormat.jl: Basic autoformat tool for Julialang

AutoFormat.jl

Basic autoformat tool for Julialang

Installing Unregistered Packages

 Pkg.clone("git://github.com/yulijia/AutoFormat.jl.git")

Example

using AutoFormat
format("/home/yu/messy_code.jl","/home/yu/messy_code_format.jl",2)
    # usage : format(input_file, output_file, tab_width)
    # download a messy code file example at https://gist.github.com/yulijia/9391666

Todo

  • Learning abstract syntax tree
  • Fix bugs
    • can not format one line block
    • matrix alignment
    • wrong comments style
  • Other features
    • indent with tabs
    • print result with STDOUT
    • print comments or not
    • display diffs instead of rewriting files

Notice

As Stefan mentioned, the right way to do this is to enhance the printing of Julia ASTs to the point where the printed form of an expression object is the properly formatted version.

Download Details:

Author: Yulijia
Source Code: https://github.com/yulijia/AutoFormat.jl 
License: View license

#julia #format 


A Julia Reader for The Harwell-Boeing and Rutherford-Boeing Formats


Installing

julia> Pkg.add("HarwellRutherfordBoeing")
julia> Pkg.test("HarwellRutherfordBoeing")

Obtaining the Harwell-Boeing Collection

Retrieve the systems from

ftp://ftp.cerfacs.fr/pub/algo/matrices/harwell_boeing

Build hsplit.c using cc -o hsplit hsplit.c. This tool may be used to split a data file into its constituents. This module will only read one set per file.

Obtaining Matrices and Supplementary Data in Rutherford-Boeing Format

The best source is the University of Florida Sparse Matrix Collection.

Example

julia> using HarwellRutherfordBoeing
julia> M = HarwellBoeingMatrix("well1850.rra")
Harwell-Boeing matrix WELL1850 of type RRA
1850 rows, 712 cols, 8758 nonzeros
1 right-hand sides, 0 guesses, 0 solutions

julia> M.matrix
1850x712 sparse matrix with 8758 Float64 entries:
[1   ,    1]  =  0.027735
[3   ,    1]  =  0.027735  # etc...

julia> M.rhs'
1x1850 Array{Float64,2}:
6.40676  0.58834  6.40279  0.595772  …  -3.30846  -2.91383  -2.91705

julia> rb = RutherfordBoeingData("aa3.rb")
Rutherford-Boeing data 1681 of type pra
825 rows, 8627 cols, 70806 nonzeros

julia> using PyPlot

julia> spy(rb.data, markersize=2)

Download Details:

Author: JuliaSparse
Source Code: https://github.com/JuliaSparse/HarwellRutherfordBoeing.jl 
License: MIT license

#julia #format 


Support for OI-FITS (optical interferometry Data Format) In Julia

OIFITS.jl

The OIFITS.jl package provides support for OI-FITS data in Julia language.

OI-FITS types

OI-FITS is a standard to store optical interferometry data as a collection of data-blocks. In the second revision of the standard (see Ref. 1 and Ref. 2), an OI-FITS file may contain the following data-blocks:

  • an OI_TARGET data-block stores a list of observed targets;
  • each OI_ARRAY data-block describes a given array of telescope stations;
  • each OI_WAVELENGTH data-block describes a given instrument, notably the effective wavelengths and bandwidths of its spectral channels;
  • OI_CORR data-blocks store correlation data;
  • OI_VIS data-blocks store complex visibility data;
  • OI_VIS2 data-blocks store squared visibility (powerspectrum) data;
  • OI_T3 data-blocks store triple product (bispectrum) data;
  • OI_FLUX data-blocks store spectral flux data;
  • OI_INSPOL data-blocks store instrumental polarization data.

These data-blocks are stored as binary tables in a FITS data file. The support for FITS files is provided by the FITSIO.jl package.

The Julia type of an OI-FITS data-block is named after the corresponding OI-FITS extension. In addition to these types for individual OI-FITS data-blocks, the OIFITS.jl package provides data-sets (of type OIDataSet) that contain several OI-FITS data-blocks. Each data-set is an efficient representation of the contents of a compliant OI-FITS file.

Reading and writing OI-FITS files

Reading and writing OI-FITS data-sets

Reading an OI-FITS data file in Julia yields a data-set and is done by:

using OIFITS
ds = read(OIDataSet, input)

where input is the name of the OI-FITS file or an instance of FITSIO.FITS which represents an open FITS file. The above read call is equivalent to the shortcut:

ds = OIDataSet(input)

It is possible to merge the contents of several OI-FITS files, say inp1, inp2, etc., by one of:

ds = read(OIDataSet, inp1, inp2, ...)
ds = OIDataSet(inp1, inp2, ...)

or to merge them into an existing data-set ds:

read!(ds, inp1, inp2, ...)

Creating an OI-FITS file is as simple as writing the data-set ds:

write(filename, ds)

Overwriting is forbidden by default, but the keyword overwrite=true may be specified to allow for silently overwriting an existing file.

Reading individual OI-FITS data-blocks

It may be useful to read individual OI-FITS data-blocks, to debug or to fix the contents of a non-compliant OI-FITS file. To that end, you must open the FITS file and can then read a given HDU as an OI-FITS data-block:

using FITSIO, OIFITS
f = FITS(filename, "r")     # open FITS file for reading
tgt = OI_TARGET(f[i])       # read OI_TARGET extension in i-th HDU
tgt = read(OI_TARGET, f[i]) # idem
db = OI_VIS2(f[j])          # read OI_VIS2 extension in j-th HDU
db = read(OI_VIS2, f[j])    # idem
...

Any OI-FITS data-block type can be used in that way. If the type of the i-th extension is not known, OIDataBlock can be used instead, but the result is not type-stable:

db = OIDataBlock(f[i])       # read OI-FITS extension in i-th HDU
db = read(OIDataBlock, f[i]) # idem

Writing individual OI-FITS data-blocks is also possible:

using FITSIO, OIFITS
f = FITS(filename, "w") # open FITS file for writing
write(f, db)            # write db in the next HDU of f

To fix a non-compliant OI-FITS file (usually duplicate target or instrument names), you can read all the data-blocks, fix those which are wrong, and push them in order into an OIDataSet to obtain a consistent data-set which you can then use directly or write to an OI-FITS file for later. Thanks to the automatic rewriting of target identifiers, and to the fact that targets (and other dependencies) are identified by their name and consistently merged, it is possible to push an OI_TARGET with multiply defined identical targets (apart maybe from their identifiers).

Accessing the contents of data-blocks and data-sets

The contents of OI-FITS data-blocks and data-sets may be accessed by the dot notation but also by indexation.

Contents of data-sets

The dot notation can be used on a data-set object, say ds, storing a consistent set of OI-FITS data-blocks. The following properties are available:

ds.target is the OI_TARGET data-block of the OI-FITS structure.

ds.instr is a list of OI_WAVELENGTH data-blocks indexed by a regular integer index or by the instrument name:

ds.instr[i]       # yields the i-th OI_WAVELENGTH data-block
ds.instr[insname] # yields the OI_WAVELENGTH data-block whose name matches insname

Matching of names follows FITS conventions: the case of letters and trailing spaces are ignored. An exception is thrown if the index (integer or name) is not valid. The get method can be used to provide a default value, for example:

get(ds.instr, insname, nothing)

would yield nothing if insname is not found in ds.instr instead of throwing an exception.

ds.array is a list of OI_ARRAY data-blocks indexed like ds.instr except that interferometric array names are assumed.

ds.correl is a list of OI_CORR data-blocks indexed like ds.instr except that correlation data array names are assumed.

ds.vis is a vector of OI_VIS data-blocks.

ds.vis2 is a vector of OI_VIS2 data-blocks.

ds.t3 is a vector of OI_T3 data-blocks.

ds.flux is a vector of OI_FLUX data-blocks.

ds.inspol is a vector of OI_INSPOL data-blocks.

Other fields of data-sets shall be considered as private and not accessed directly.

Using the dot notation, it is easy to access to the different data-blocks containing measurements. For instance:

for db in ds.vis2
    ...
end

is convenient to loop across all OI_VIS2 instances stored by ds.

Contents of data-blocks

The contents of a data-block, say db, may also be accessed by the dot notation. As a general rule, db.key or db.col yield the value of the keyword key or the contents of the column col of the OI-FITS table corresponding to the data-block db. In order to follow Julia conventions and to accommodate a number of restrictions, key or col are the FITS keyword or column name converted to lower case letters and with non-alphanumeric characters replaced by underscores. For instance db.date_obs yields the value of the keyword DATE-OBS, that is the UTC start date of observations. The revision number corresponding to the keyword OI_REVN is however accessed as db.revn; this is the only exception. Other properties are also accessible via this syntax:

db.extname yields the OI-FITS name of the extension corresponding to the data-block db (for all data-block types);

db.array yields the OI_ARRAY data-block associated with data-block db (only for OI_VIS, OI_VIS2, OI_T3, OI_FLUX, and OI_INSPOL data-blocks). Beware that the association with an OI_ARRAY is optional, so db.array may actually be undefined; this can be checked with isdefined(db, :array).

db.instr yields the OI_WAVELENGTH data-block associated with data-block db (only for OI_VIS, OI_VIS2, OI_T3, and OI_FLUX data-blocks).

db.correl yields the OI_CORR data-block associated with data-block db (only for OI_VIS, OI_VIS2, OI_T3, and OI_FLUX data-blocks).

db.name is an alias for db.arrname for OI_ARRAY instances, for db.insname for OI_WAVELENGTH instances, and for db.corrname for OI_CORR instances.

Of course, getting a given property must make sense. For example, db.sta_name is only possible for an OI_ARRAY data-block, not for an OI_WAVELENGTH data-block. The dot notation can, however, be chained, and:

db.instr.eff_wave

can be used to access the effective wavelengths of the measurements in db via the instrument associated with db. Shortcuts are provided:

λ  = db.eff_wave # get effective wavelength
Δλ = db.eff_band # get effective bandwidth

for OI_WAVELENGTH data-blocks but also for OI_VIS, OI_VIS2, OI_T3, and OI_FLUX data-blocks.
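
For instance, here is a minimal sketch of collecting the squared visibilities together with their wavelengths; the column names vis2data and vis2err are assumed from the standard OI_VIS2 columns VIS2DATA and VIS2ERR via the lower-case naming rule described above:

for db in ds.vis2
    λ  = db.eff_wave   # effective wavelengths via the linked OI_WAVELENGTH
    v2 = db.vis2data   # squared visibilities (standard VIS2DATA column)
    e2 = db.vis2err    # associated errors (standard VIS2ERR column)
    # ... process (λ, v2, e2) for this data-block ...
end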

Some fields of a data-block db may, however, be undefined (see the defensive-access sketch after the list below) because:

the field is not yet defined (the data-block is being constructed);

the field is optional in the revision db.revn of the data-block;

the field (for example db.instr for an OI_VIS data-block) involves links with other data-blocks (the dependencies) and these links are only defined when a data-block is part of a data-set (see Building of data-sets below).
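
A minimal defensive-access sketch (assuming db is a data-block, such as an OI_VIS2 instance, whose OI_ARRAY dependency may be undefined):

if isdefined(db, :array)
    println("array: ", db.array.name)   # `name` is an alias for `arrname`
else
    println("no OI_ARRAY associated with this data-block")
end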

OI_TARGET data-blocks

For efficiency, instances of OI_TARGET data-blocks do not follow the same rules as other types of OI-FITS data-blocks, whose properties are the columns of the corresponding OI-FITS table: in an OI_TARGET instance, all parameters describing a target are represented by an OITargetEntry structure and all targets are stored as a vector of OITargetEntry. An OI_TARGET instance, say db, has the following 3 properties:

db.extname # yields "OI_TARGET"
db.list    # yields a vector of OITargetEntry instances
db.revn    # yields the revision number

The list of targets db.list can be indexed by an integer (as any Julia vector) or by the target name (case of letters and trailing spaces are irrelevant).

As an OI_TARGET data-block is essentially a vector of target entries, it can be used as an iterable and it can be indexed by an integer index or by a target name:

length(db) # the number of targets, shortcut for `length(db.list)`
db[i]      # the i-th target, shortcut for `db.list[i]`
db[key]    # the target whose name matches string `key`, shortcut for `db.list[key]`

Standard methods get and haskey, applied to db.list or directly to db, work as expected and according to the type (integer or string) of the key. For the keys method, the default is to return an iterator over the target names, but the type of the expected keys can be specified:

get(db,key,def)   # yields `db[key]` or `def` if `key` not found
keys(db)          # iterator over target names
keys(String, db)  # idem
keys(Integer, db) # iterator over target indices
keys(Int, db)     # idem

The method OIFITS.get_column is a helper to recover a single target field as a vector:

OIFITS.get_column([T,] db, col)

yields the column col of an OI-FITS data-block db. The column is identified by col, which is either sym or Val(sym), where sym is the symbolic name of the corresponding field in OITargetEntry. The optional argument T specifies the element type of the returned array.
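
For example, a hedged sketch extracting a few target fields as plain vectors (the field names target, raep0, and decep0 are taken from the OITargetEntry keywords listed below):

names = OIFITS.get_column(String, db, :target)   # target names as a Vector{String}
ra    = OIFITS.get_column(db, :raep0)            # right ascensions
dec   = OIFITS.get_column(db, :decep0)           # declinations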

To build an OI_TARGET instance, you may provide the list of targets and the revision number:

OI_TARGET(lst=OITargetEntry[]; revn=0)

yields an OI_TARGET data-block. Optional argument lst is a vector of OITargetEntry specifying the targets (none by default). Keyword revn specifies the revision number.

A target entry may be constructed by specifying all its fields (there are many) by keywords, all of which are mandatory except category:

x = OITargetEntry(;
        target_id ::Integer,
        target    ::AbstractString,
        raep0     ::AbstractFloat,
        decep0    ::AbstractFloat,
        equinox   ::AbstractFloat,
        ra_err    ::AbstractFloat,
        dec_err   ::AbstractFloat,
        sysvel    ::AbstractFloat,
        veltyp    ::AbstractString,
        veldef    ::AbstractString,
        pmra      ::AbstractFloat,
        pmdec     ::AbstractFloat,
        pmra_err  ::AbstractFloat,
        pmdec_err ::AbstractFloat,
        parallax  ::AbstractFloat,
        para_err  ::AbstractFloat,
        spectyp   ::AbstractString,
        category  ::AbstractString = "")

It is also possible to specify another target entry, say ref, which is used as a template: any unspecified keyword is assumed to have the same value as in ref:

x = OITargetEntry(ref;
        target_id = ref.target_id,
        target    = ref.target,
        ...)

Note that, when an OI_TARGET instance is pushed into a data-set, target identifiers (field target_id) are automatically rewritten to be identical to the index of the target in the list of targets of the data-set.
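
Putting these pieces together, here is a hedged sketch of building a small OI_TARGET data-block from scratch (the names, coordinates, and other values are placeholders, not real data):

entry1 = OITargetEntry(;
    target_id = 1, target = "target A", raep0 = 219.9, decep0 = -60.8,
    equinox = 2000.0, ra_err = 0.0, dec_err = 0.0, sysvel = 0.0,
    veltyp = "LSR", veldef = "OPTICAL", pmra = 0.0, pmdec = 0.0,
    pmra_err = 0.0, pmdec_err = 0.0, parallax = 0.0, para_err = 0.0,
    spectyp = "G2V")
entry2 = OITargetEntry(entry1; target_id = 2, target = "target B")  # template copy
tgt = OI_TARGET([entry1, entry2]; revn = 2)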

Building of data-sets

Pushing data-blocks to data-sets

Reading an OI-FITS file is the easiest way to define a data-set, but a new OI-FITS data-set may be built by creating an empty data-set with OIDataSet(), and then pushing OI-FITS data-blocks in order with push!(...). Indeed, in order to ensure the consistency of a data-set, it is required to push the dependencies (OI_TARGET, OI_ARRAY, OI_WAVELENGTH, and OI_CORR data-blocks) before the data-blocks containing measurements (OI_VIS, OI_VIS2, OI_T3, OI_FLUX, and OI_INSPOL) that may refer to them.

For example, building a new data-set, say ds, looks like:

ds = OIDataSet() # create empty data-set
push!(ds, arr)   # push OI_ARRAY data-block(s)
push!(ds, ins)   # push OI_WAVELENGTH data-block(s)
push!(ds, cor)   # push OI_CORR data-block(s)
push!(ds, tgt)   # push OI_TARGET data-block
push!(ds, db1)   # push data
push!(ds, db2)   # push more data
push!(ds, db3)   # push even more data
...

with the dependencies:

arr an OI_ARRAY instance defining the interferometric array (zero or any number of such instances may be pushed),

ins an OI_WAVELENGTH instance defining the instrument (several such instances can be pushed),

cor an OI_CORR instance defining the correlations (zero or any number of such instances can be pushed),

tgt an OI_TARGET instance defining the list of observed targets (at least one such instance is required, if more such instances are pushed in the same data-set, they are merged in a single one);

and where db1, db2, db3, etc., are instances of OI_VIS, OI_VIS2, OI_T3, OI_FLUX, or OI_INSPOL that provide measurements.

You may push all data-blocks in a single push! call:

ds = push!(OIDataSet(), arr, ins, cor, tgt, db1, db2, ...)

and the following shortcut is implemented:

ds = OIDataSet(arr, ins, cor, tgt, db1, db2, ...)

These two are equivalent to the multi-line example above, but remember that pushing data-blocks in order (i.e., dependencies before they may be referenced) is required to have a consistent data-set. Apart from this constraint, dependencies may be pushed in any order before the data-blocks with measurements, and data-blocks with measurements can be pushed in any order after the dependencies.

As a benefit of the constraint of pushing data-blocks in order, data-blocks with dependencies are automatically linked to these dependencies when pushed onto the data-set (which implies that the dependencies already exist in the data-set). This allows for syntactic sugar like:

ds.vis2[i].eff_wave # the wavelengths of the i-th OI_VIS2 data-block in ds
ds.t3[i].array      # the interferometric array for the i-th OI_T3 data-block in ds
ds.vis[i].instr     # the instrument used for the i-th OI_VIS data-block in ds

Without linked dependencies, the first example above would require (1) finding in the data-set ds the OI_WAVELENGTH instance, say ins, whose name matches ds.vis2[i].insname, and (2) extracting the field eff_wave of ins. The latter step is as simple as ins.eff_wave, but the former has some overhead and scales as O(n) with n the number of OI_WAVELENGTH instances in the data-set.
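
A minimal sketch of the two approaches (assuming i indexes an existing OI_VIS2 data-block in ds and that the standard INSNAME keyword is accessible as insname):

ins = ds.instr[ds.vis2[i].insname]   # (1) O(n) lookup of the OI_WAVELENGTH by name
λ   = ins.eff_wave                   # (2) extract the effective wavelengths
# With the dependency linked at push! time, this reduces to:
λ   = ds.vis2[i].instr.eff_wave      # or simply ds.vis2[i].eff_wave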

Since an OI-FITS data-set has a single list of targets (an OI_TARGET instance accessible via ds.target), a means of merging lists of targets had to be defined. The adopted rule is pretty simple:

The target_id field of any data-block that is part of a data-set corresponds to the index of the target entry in the list of targets stored by the data-set.

As a consequence, whenever a data-block is pushed into a data-set, the target identifiers of the data-block have to be rewritten according to this rule. Of course this does not apply for data-blocks with no target_id field such as OI_ARRAY, OI_WAVELENGTH, and OI_CORR.

To summarize, here is what happens under the hood when a data-block db is pushed into a data-set ds:

When an OI_ARRAY, OI_WAVELENGTH, or OI_CORR instance db is pushed in a data-set ds, it is appended to the corresponding list (ds.array, ds.instr, or ds.correl) unless this list already has an entry with a name matching db.name. In the latter case, nothing is done, except that an assertion exception is thrown if the two data-blocks with matching names do not have the same contents (to prevent building inconsistent data-sets).

When an OI_TARGET instance is pushed in a data-set, the new targets (according to their names) are appended to the list of targets in the data-set and their identifiers set to their index in this list. This also re-initializes an internal dictionary used to perform the conversion from all the target identifiers of the OI_TARGET instance that has been pushed to the target identifiers in the data-set. Until it is reinitialized (by pushing another OI_TARGET instance), this mapping is used to rewrite the target identifiers of subsequent data-blocks pushed in the data-set.

When an OI_VIS, OI_VIS2, OI_T3, OI_FLUX, or OI_INSPOL instance db is pushed in a data-set ds, it is appended to the corresponding list (ds.vis, ds.vis2, ds.t3, ds.flux, or ds.inspol), after it has been linked to its dependencies (OI_ARRAY, OI_WAVELENGTH, etc., which must already exist in the data-set) and its target identifiers have been rewritten according to the mapping defined by the last OI_TARGET instance previously pushed to the data-set. Rewriting of the target identifiers may be avoided by using the keyword rewrite_target_id=false; this assumes that the target identifiers in the pushed data-block are already set according to the index in the list of targets ds.target.

Pushing a data-block into a data-set does check the consistency of the data-block. This is to allow for building data-blocks step by step, so that they need not be consistent at all times (just when they are pushed into a data-set).

Pushing a data-block into a data-set leaves the data-block unchanged. A shallow copy of it is added to the data-blocks stored by the data-set. Most members of the pushed data-block are shared with the one stored by the data-set, with the notable exceptions of the target identifiers, which are rewritten, and the links to the dependencies, which are updated.

While it sounds complicated, the default rule of rewriting the target identifiers just amounts to assuming that the target identifiers of OI_VIS, OI_VIS2, OI_T3, OI_FLUX, or OI_INSPOL instances pushed into a data-set refer to the last OI_TARGET instance previously pushed onto the same data-set.

Pushing several groups of data-blocks, each group making a consistent data-set, in the same data-set is easy. Typically:

# First push dependencies for group 1.
push!(ds, group1_arr) # push OI_ARRAY
push!(ds, group1_ins) # push OI_WAVELENGTH
push!(ds, group1_cor) # push OI_CORR
push!(ds, group1_tgt) # push OI_TARGET (reinitializing target_id mapping)
# Then push data for group 1 (using current target_id mapping).
push!(ds, group1_db1)
push!(ds, group1_db2)
...
# First push dependencies for group 2.
push!(ds, group2_arr) # push OI_ARRAY
push!(ds, group2_ins) # push OI_WAVELENGTH
push!(ds, group2_cor) # push OI_CORR
push!(ds, group2_tgt) # push OI_TARGET (reinitializing target_id mapping)
# Then push data for group 2 (using current target_id mapping).
push!(ds, group2_db1)
push!(ds, group2_db2)
...

Since they are referenced by their names, it is not necessary to push OI_ARRAY, OI_WAVELENGTH, and OI_CORR dependencies if they already exist in the data-set (according to their name), but it doesn't hurt. It is however mandatory to push an OI_TARGET instance with all targets and their identifiers as assumed by the subsequent data-blocks.

Merging data-sets

Two OI-FITS data-sets (or more), say A and B, can be consistently merged together by:

C = merge(A, B)

As much as possible, the resulting data-set C will share its contents with A and/or B but without affecting A and B which are guaranteed to remain unchanged. As for pushing data-blocks, the target identifiers (the target_id field) may be rewritten in the result.

Merging of data-sets assumes that the two merged data-sets are consistent and compatible. Here compatible means that targets and dependencies with matching names must have the same contents. This is checked during the merge operation.

It is also allowed to merge several data-sets and/or merge data-sets in-place:

ds = merge(ds1, ds2, ds3, ...) # merge ds1, ds2, ... in new data-set ds
merge!(ds, ds1, ds2, ds3, ...) # merge ds1, ds2, ... in existing data-set ds

Note that merge!(ds,...) yields the destination ds.

Also note that, after merging, the internal dictionary used for rewriting target identifiers is left with the mapping built from the targets of the last merged data-set.

Credits

The development of this package has received funding from the European Community's Seventh Framework Programme (FP7/2013-2016) under Grant Agreement 312430 (OPTICON).

References

Pauls, T. A., Young, J. S., Cotton, W. D., & Monnier, J. D. "A data exchange standard for optical (visible/IR) interferometry." Publications of the Astronomical Society of the Pacific, vol. 117, no. 837, p. 1255 (2005).

Duvert, G., Young, J., & Hummel, C. "OIFITS 2: the 2nd version of the Data Exchange Standard for Optical (Visible/IR) Interferometry." arXiv preprint arXiv:1510.04556.

Download Details:

Author: Emmt
Source Code: https://github.com/emmt/OIFITS.jl 
License: View license

#julia #data #format 

Support for OI-FITS (optical interferometry Data Format) In Julia

MachO.jl: An Implementation Of The MachO File format

MachO

Usage

MachO.jl implements the ObjFileBase interface.

To open a MachO file simply:

julia> using MachO
julia> # NOTE: ObjFileBase.readmeta is reexported by MachO
julia> h = readmeta("/usr/lib/libz.a")
Fat Mach Handle (2 architectures)
architecture 1
        cputype X86_64
         offset 0x00001000
           size 0x00015710
          align 12

architecture 2
        cputype X86
         offset 0x00017000
           size 0x000125b0
          align 12

This will return a handle to the MachO object file. If your object file contains MachO headers for multiple architectures (as in the example above), simply index into the handle to obtain a handle for the MachO object:

julia> mh = h[1]
MachO handle (64-bit)

Accessing Load Commands

Load commands are accessed via the iteration protocol using the iterator LoadCmds. The easiest way to see all the load commands in a file is to use collect:

julia> collect(LoadCmds(h[1]))
16-element Array{Any,1}:
 0x00000020:
 Load Command (SEGMENT_64):
           name __TEXT
           addr 0
           size 73728
        fileoff 0
       filesize 73728
        maxprot rwx
       initprot rx
         nsects 6
          flags (none)
[snip]

Working with load commands

Note that the object returned by the iterator is not the load command itself, but an object also containing a reference to the object file. This is done for convenience, as it prevents the need to pass the object file around at the command line.
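
For example, a quick (hedged) way to list the type of each load command in the handle mh, reusing the same eltype-based access as the SYMTAB example below:

julia> [eltype(cmd) for cmd in LoadCmds(mh)]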

Accessing the symbols in a segment

As with load commands, symbols are accessed via an iterator interface; however, instead of passing the object handle into the iterator, it expects a load command denoting a symbol table:

julia> l = filter(x->eltype(x)==MachO.symtab_command,LoadCmds(mh)) |> first
0x000004c8:
 Load Command (SYMTAB):
         symoff 79552
          nsyms 87
         stroff 81104
        strsize 1056

julia> Symbols(l) |> collect
87-element Array{Any,1}:
 nlist_64(0x00000407,0x3c,0x00,0x0000,0x0000000005614542)
 nlist_64(0x00000004,0x0f,0x01,0x0000,0x00000000000010f0)
 nlist_64(0x0000000d,0x0f,0x01,0x0000,0x0000000000001218)
[snip]

Finding symbols by name

The symname function can be used to get the name of a symbol:

julia> map(x->symname(l,x),Symbols(l))
87-element Array{Any,1}:
 "radr://5614542"
 "_adler32"
 "_adler32_combine"
 "_compress"

julia> filter(x->symname(l,x)=="_compress",Symbols(l)) |> first
nlist_64(0x0000001e,0x0f,0x01,0x0000,0x00000000000013a3)

Download Details:

Author: Keno
Source Code: https://github.com/Keno/MachO.jl 
License: View license

#julia #format 

MachO.jl: An Implementation Of The MachO File format

DWARF.jl: Julia Package for Parsing The DWARF File Format

DWARF

Goal

This package aims to provide a complete implementation of a decoder for the DWARF v4 debug information format (with v5 support added where it is already in common use, and complete support once v5 is released), as specified at dwarfstd.org. The APIs are designed to be usable at the REPL, even if doing so has a minor impact on achievable performance. Nevertheless, the package should be performant enough to be used in debuggers, unwinders, etc.

In particular, this package does not provide any higher level debugging functionality.

Provided Implementations

  • DWARF DIE Trees
  • DWARF Expressions
  • DWARF Line Table
  • DWARF Call Frame Information

Download Details:

Author: Keno
Source Code: https://github.com/Keno/DWARF.jl 
License: View license

#julia #format 

DWARF.jl: Julia Package for Parsing The DWARF File Format

DocumentFormat.jl: Auto-formatter for Julia

DocumentFormat 

An auto formatter for Julia.

Installation and Usage

using Pkg
Pkg.add("DocumentFormat")
using DocumentFormat

Documentation: Dev

Overview

The main function to format code is format. When called with a string argument, that string is assumed to be code and a new string in which the code is formatted is returned. When called with an AbstractPath that points to a file, that file is formatted. If called with an AbstractPath that points to a folder, all *.jl files in that folder are formatted.

The function isformatted checks whether a piece of code is formatted. It can be called with a string (assumed to hold code) or an AbstractPath.
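
A minimal sketch of the string-based API described above (the input snippet is just an illustrative, poorly spaced expression):

using DocumentFormat

src = "foo( x )=x+ 1"
isformatted(src)          # presumably false for this unformatted snippet
formatted = format(src)   # returns a new string with the code formatted
isformatted(formatted)    # true once the code has been formatted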

This package is deprecated. Please use JuliaFormatter.jl instead!

Download Details:

Author: Julia-vscode
Source Code: https://github.com/julia-vscode/DocumentFormat.jl 
License: View license

#julia #format 

DocumentFormat.jl: Auto-formatter for Julia

Julia Implementation Of Parquet Columnar File format Reader

Parquet

Reader

A parquet file or dataset can be loaded using the read_parquet function. A parquet dataset is a directory with multiple parquet files, each of which is a partition belonging to the dataset.

read_parquet(path; kwargs...) returns a Parquet.Table or Parquet.Dataset, which is the table contained in the parquet file or dataset in a Tables.jl compatible format.

Options:

  • rows: The row range to iterate through, all rows by default. Applicable only when reading a single file.
  • filter: Filter function to apply while loading only a subset of partitions from a dataset. The path to the partition is provided as a parameter.
  • batchsize: Maximum number of rows to read in each batch (default: row count of first row group). Applied only when reading a single file, and to each file when reading a dataset.
  • use_threads: Whether to use threads while reading the file; applicable only for Julia v1.3 and later, and switched on by default if the Julia process is started with multiple threads.
  • column_generator: Function to generate a partitioned column when not found in the partitioned table. Parameters provided to the function: table, column index, length of column to generate. Default implementation determines column values from the table path.

The returned object is a Tables.jl compatible Table and can be converted to other forms, e.g. a DataFrames.DataFrame via

using Parquet, DataFrames
df = DataFrame(read_parquet(path))

Partitions in a parquet file or dataset can also be iterated over using an iterator returned by the Tables.partitions method.

using Parquet, DataFrames
for partition in Tables.partitions(read_parquet(path))
    df = DataFrame(partition)
    ...
end

Lower Level Reader

Load a parquet file. Only metadata is read initially, data is loaded in chunks on demand. (Note: ParquetFiles.jl also provides load support for Parquet files under the FileIO.jl package.)

Parquet.File represents a Parquet file at path open for reading.

Parquet.File(path) => Parquet.File

Parquet.File keeps a handle to the open file and the file metadata and also holds a weakly referenced cache of page data read. If the parquet file references other files in its metadata, they will be opened as and when required for reading and closed when they are not needed anymore.

The close method closes the reader, releases open files and makes cached internal data structures available for GC. A Parquet.File instance must not be used once closed.

julia> using Parquet

julia> filename = "customer.impala.parquet";

julia> parquetfile = Parquet.File(filename)
Parquet file: customer.impala.parquet
    version: 1
    nrows: 150000
    created by: impala version 1.2-INTERNAL (build a462ec42e550c75fccbff98c720f37f3ee9d55a3)
    cached: 0 column chunks

Examine the schema.

julia> nrows(parquetfile)
150000

julia> ncols(parquetfile)
8

julia> colnames(parquetfile)
8-element Array{Array{String,1},1}:
 ["c_custkey"]
 ["c_name"]
 ["c_address"]
 ["c_nationkey"]
 ["c_phone"]
 ["c_acctbal"]
 ["c_mktsegment"]
 ["c_comment"]

julia> schema(parquetfile)
Schema:
    schema {
      optional INT64 c_custkey
      optional BYTE_ARRAY c_name
      optional BYTE_ARRAY c_address
      optional INT32 c_nationkey
      optional BYTE_ARRAY c_phone
      optional DOUBLE c_acctbal
      optional BYTE_ARRAY c_mktsegment
      optional BYTE_ARRAY c_comment
    }

The reader performs logical type conversions automatically for String (from byte arrays), decimals (from fixed length byte arrays) and DateTime (from Int96). It depends on the converted type being populated correctly in the file metadata to detect such conversions. To take care of files where such metadata is not populated, an optional map_logical_types argument can be provided while opening the parquet file. The map_logical_types value must map column names to a tuple of return type and converter function. Return types of String and DateTime are supported as of now, and default implementations for them are included in the package.

julia> mapping = Dict(["column_name"] => (String, Parquet.logical_string));

julia> parquetfile = Parquet.File("filename"; map_logical_types=mapping);

The reader will interpret logical types based on the map_logical_types provided. The following logical type mapping methods are available in the Parquet package.

  • logical_timestamp(v; offset=Dates.Second(0)): Applicable for timestamps that are INT96 values. This converts the data read as Int128 types to DateTime types.
  • logical_string(v): Applicable for strings that are BYTE_ARRAY values. Without this, they are represented in a Vector{UInt8} type. With this they are converted to String types.
  • logical_decimal(v, precision, scale; use_float=true): Applicable for reading decimals from FIXED_LEN_BYTE_ARRAY, INT64, or INT32 values. This converts the data read as those types to Integer, Float64 or Decimal of the given precision and scale, depending on the options provided.

Variants of these methods or custom methods can also be applied by the caller.
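
For instance, a hedged sketch of a custom converter built on logical_timestamp (the column name "ts" is hypothetical, and the 30-second offset is arbitrary):

julia> using Parquet, Dates

julia> custom_types = Dict(["ts"] => (DateTime, v -> Parquet.logical_timestamp(v; offset=Dates.Second(30))));

julia> parquetfile = Parquet.File("filename"; map_logical_types=custom_types);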

BatchedColumnsCursor

Create a cursor to iterate over batches of column values. Each iteration returns a named tuple of column names with a batch of column values. Files with nested schemas cannot be read with this cursor.

BatchedColumnsCursor(parquetfile::Parquet.File; kwargs...)

Cursor options:

  • rows: the row range to iterate through, all rows by default.
  • batchsize: maximum number of rows to read in each batch (default: row count of first row group).
  • reusebuffer: boolean to indicate whether to reuse the buffers with every iteration; if each iteration processes the batch and does not need to refer to the same data buffer again, then setting this to true reduces GC pressure and can help significantly while processing large files.
  • use_threads: whether to use threads while reading the file; applicable only for Julia v1.3 and later, and switched on by default if the Julia process is started with multiple threads.

Example:

julia> typemap = Dict(["c_name"]=>(String,Parquet.logical_string), ["c_address"]=>(String,Parquet.logical_string));

julia> parquetfile = Parquet.File("customer.impala.parquet"; map_logical_types=typemap);

julia> cc = BatchedColumnsCursor(parquetfile)
Batched Columns Cursor on customer.impala.parquet
    rows: 1:150000
    batches: 1
    cols: c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment

julia> batchvals, state = iterate(cc);

julia> propertynames(batchvals)
(:c_custkey, :c_name, :c_address, :c_nationkey, :c_phone, :c_acctbal, :c_mktsegment, :c_comment)

julia> length(batchvals.c_name)
150000

julia> batchvals.c_name[1:5]
5-element Array{Union{Missing, String},1}:
 "Customer#000000001"
 "Customer#000000002"
 "Customer#000000003"
 "Customer#000000004"
 "Customer#000000005"

RecordCursor

Create a cursor to iterate over records. In parallel mode, multiple remote cursors can be created and iterated on in parallel.

RecordCursor(parquetfile::Parquet.File; kwargs...)

Cursor options:

  • rows: the row range to iterate through, all rows by default.
  • colnames: the column names to retrieve; all by default

Example:

julia> typemap = Dict(["c_name"]=>(String,Parquet.logical_string), ["c_address"]=>(String,Parquet.logical_string));

julia> parquetfile = Parquet.File("customer.impala.parquet"; map_logical_types=typemap);

julia> rc = RecordCursor(parquetfile)
Record Cursor on customer.impala.parquet
    rows: 1:150000
    cols: c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment

julia> records = collect(rc);

julia> length(records)
150000

julia> first_record = first(records);

julia> isa(first_record, NamedTuple)
true

julia> propertynames(first_record)
(:c_custkey, :c_name, :c_address, :c_nationkey, :c_phone, :c_acctbal, :c_mktsegment, :c_comment)

julia> first_record.c_custkey
1

julia> first_record.c_name
"Customer#000000001"

julia> first_record.c_address
"IVhzIApeRb ot,c,E"

Writer

You can write any Tables.jl column-accessible table that contains columns of these types and their union with Missing: Int32, Int64, String, Bool, Float32, Float64.

However, CategoricalArrays are not yet supported. Furthermore, these types are not yet supported: Int96, Int128, Date, and DateTime.

Writer Example

tbl = (
    int32 = Int32.(1:1000),
    int64 = Int64.(1:1000),
    float32 = Float32.(1:1000),
    float64 = Float64.(1:1000),
    bool = rand(Bool, 1000),
    string = [randstring(8) for i in 1:1000],
    int32m = rand([missing, 1:100...], 1000),
    int64m = rand([missing, 1:100...], 1000),
    float32m = rand([missing, Float32.(1:100)...], 1000),
    float64m = rand([missing, Float64.(1:100)...], 1000),
    boolm = rand([missing, true, false], 1000),
    stringm = rand([missing, "abc", "def", "ghi"], 1000)
)

file = tempname()*".parquet"
write_parquet(file, tbl)
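
As a hedged follow-up, the file just written can be read back with read_parquet (see the Reader section above) to check the round trip:

using Parquet, DataFrames
df = DataFrame(read_parquet(file))
size(df)   # expected: (1000, 12)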

Download Details:

Author: JuliaIO
Source Code: https://github.com/JuliaIO/Parquet.jl 
License: View license

#julia #format 

Julia Implementation Of Parquet Columnar File format Reader

Internal format & Rules For Managing Bibliography In Pure Julia

BibInternal.jl

This package provides an internal format to translate from/to other bibliographic formats.

Warning: the support for this package will move to Julia LTS once the next LTS release is available.

All entries depend on an abstract super type AbstractEntry. One generic entry GenericEntry is available to make entries without any specific rules.

Currently, only one set of entries following the BibTeX rules is available. Required and optional BibTeX fields are checked by the constructor.

Pull Requests to add more entries (or update the BibTeX rules) are welcome.

Discussions are welcome either on this GitHub repository or on the #modern-academics channel of the Humans of Julia Discord server.


Download Details:

Author: Humans-of-Julia
Source Code: https://github.com/Humans-of-Julia/BibInternal.jl 
License: MIT license

#julia #format 

Internal format & Rules For Managing Bibliography In Pure Julia

Gtf-parse-off: Experiments with Parsing Gene Transfer format

Motivation

This repository contains some rough experimentation in parsing simple file formats. The main use case I'm evaluating is parsing data in the gene transfer format (GTF). The format can be parsed by a regular expression, but it's weird enough that it takes a little work.

GTF is just a stand-in for the sort of simple, but not quite trivial, file formats that abound in scientific computing. If we can build fast parsers for all of these formats with minimal effort, it would help a lot with Julia's already increasing viability.

Benchmarks

I timed the parsing of the first 100000 lines (24MB) of Ensembl's version 71 human genome annotations. The full file is 2253155 lines (502MB), so scaling each of these numbers by about 20 gives the time needed to parse the whole thing.

These timings are not terribly scientific. E.g., I'm not counting time spent on I/O in Julia, but I am in the other methods. Also, I may or may not have been watching YouTube videos while I waited for the Julia PCRE benchmark to finish.

Language   Method              Elapsed Seconds
C          Ragel table-based   0.42
C          Ragel goto-based    0.05
Python     Hand written        28.28
Python     Regex               0.64
PyPy       Regex               1.09
Ruby       Ragel table-based   199.39
Julia      Ragel table-based   3.52
Julia      PCRE                1560.50

Notes

My Julia backend for Ragel is at dcjones/ragel-julia.

The hand-written Python parser is from bcbb.

Download Details:

Author: Dcjones
Source Code: https://github.com/dcjones/gtf-parse-off 

#julia #parsed #format 

Gtf-parse-off: Experiments with Parsing Gene Transfer format

FastaIO.jl: Utilities to Read/write FASTA format Files in Julia

FastaIO.jl

Utilities to read/write FASTA format files in Julia.

Installation and usage

Installation

To install the module, use Julia's package manager: start pkg mode by pressing ] and then enter:

(v1.3) pkg> add FastaIO

Dependencies will be installed automatically. The module can then be loaded like any other Julia module:

julia> using FastaIO

Documentation

  • STABLE: most recently tagged version of the documentation.
  • DEV: in-development version of the documentation.

See also the examples in the examples/ directory.
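
As a quick illustration, here is a minimal sketch of reading a FASTA file with the FastaReader iterator (this assumes the do-block interface; see the documentation linked above for the authoritative API):

using FastaIO

FastaReader("somefile.fasta") do fr
    for (name, seq) in fr
        println("$name has length $(length(seq))")
    end
end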

Download Details:

Author: Carlobaldassi
Source Code: https://github.com/carlobaldassi/FastaIO.jl 
License: View license

#julia #format #files 

FastaIO.jl: Utilities to Read/write FASTA format Files in Julia