Cachepot: Shared Compilation Cache Written in Rust

cachepot is a ccache-like compiler caching tool. It is used as a compiler wrapper and avoids compilation when possible, storing cached results either on local disk or in one of several cloud storage backends.

It's also a fork of sccache with improved security properties and improvements throughout the code base. We contribute back upstream as much as we can, but the goals might not be a 100% match.

cachepot includes support for caching the compilation of C/C++ code, Rust, as well as NVIDIA's CUDA using nvcc.

cachepot also provides icecream-style distributed compilation (automatic packaging of local toolchains) for all supported compilers (including Rust). The distributed compilation system includes several security features that icecream lacks such as authentication, transport layer encryption, and sandboxed compiler execution on build servers. See the distributed quickstart guide for more information.

Installation

There are prebuilt x86-64 binaries available for Windows, Linux (a portable binary compiled against musl), and macOS on the releases page.

If you have a Rust toolchain installed you can install cachepot using cargo. Note that this will compile cachepot from source which is fairly resource-intensive. For CI purposes you should use prebuilt binary packages.

cargo install --git https://github.com/paritytech/cachepot

Usage

Running cachepot is like running ccache: prefix your compilation commands with it, like so:

cachepot gcc -o foo.o -c foo.c

If you want to use cachepot for caching Rust builds you can define build.rustc-wrapper in the cargo configuration file. For example, you can set it globally in $HOME/.cargo/config by adding:

[build]
rustc-wrapper = "/path/to/cachepot"

Note that you need to use cargo 1.40 or newer for this to work.

Alternatively you can use the environment variable RUSTC_WRAPPER:

RUSTC_WRAPPER=/path/to/cachepot cargo build

cachepot supports gcc, clang, MSVC, rustc, NVCC, and Wind River's diab compiler.

If you don't specify otherwise, cachepot will use a local disk cache.

cachepot works using a client-server model, where the server (which we refer to as "coordinator") runs locally on the same machine as the client. The client-server model allows the server/coordinator to be more efficient by keeping some state in memory. The cachepot command will spawn a coordinator process if one is not already running, or you can run cachepot --start-coordinator to start the background server process without performing any compilation.

You can run cachepot --stop-coordinator to terminate the coordinator. It will also terminate after (by default) 10 minutes of inactivity.

Running cachepot --show-stats will print a summary of cache statistics.
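
Putting these commands together, a typical local session might look like the following sketch (the exact statistics output will vary):

cachepot --start-coordinator
cachepot gcc -o foo.o -c foo.c   # first compile: cache miss, result is stored
cachepot gcc -o foo.o -c foo.c   # same compile again: served from the cache
cachepot --show-stats            # summary of hits, misses, and cache size
cachepot --stop-coordinator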

Some notes about using cachepot with Jenkins are available.

To use cachepot with cmake, provide the following command line arguments to cmake >= 3.4:

-DCMAKE_C_COMPILER_LAUNCHER=cachepot
-DCMAKE_CXX_COMPILER_LAUNCHER=cachepot
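
For example, a complete out-of-source configure-and-build sketch (directory names are placeholders):

mkdir build && cd build
cmake -DCMAKE_C_COMPILER_LAUNCHER=cachepot \
      -DCMAKE_CXX_COMPILER_LAUNCHER=cachepot ..
cmake --build .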

Build Requirements

cachepot is a Rust program. Building it requires cargo (and thus rustc). cachepot currently requires Rust 1.56.1. We recommend you install Rust via Rustup.

Build

If you are building cachepot for non-development purposes, make sure you use cargo build --release to get optimized binaries:

cargo build --release [--no-default-features --features=s3|redis|gcs|memcached|azure]

By default, cachepot builds with support for all storage backends, but individual backends may be disabled by resetting the list of features and enabling all the other backends. Refer to the Cargo Documentation for details on how to select features with Cargo.

Linux

No native dependencies.

Build with cargo and use ldd to check that the resulting binary does not depend on OpenSSL anymore.

Linux and Podman

You can also build the repo with the Parity CI Docker image:

podman run --rm -it -w /shellhere/cachepot \
                    -v "$(pwd)":/shellhere/cachepot:Z \
                    -u $(id -u):$(id -g) \
                    --userns=keep-id \
                    docker.io/paritytech/cachepot-ci:staging cargo build --locked --release
# artifacts can be found in ./target/release

If you want to reproduce other steps of the CI process, you can use the following guide.

macOS

No native dependencies.

Build with cargo and use otool -L to check that the resulting binary does not depend on OpenSSL anymore.

Windows

On Windows, the binary might also depend on a few MSVC CRT DLLs that are not available on older Windows versions.

It is possible to statically link against the CRT using a .cargo/config file with the following contents.

[target.x86_64-pc-windows-msvc]
rustflags = ["-Ctarget-feature=+crt-static"]

Build with cargo and use dumpbin /dependents to check that the resulting binary does not depend on MSVC CRT DLLs anymore.

Storage Options

Local

cachepot defaults to using local disk storage. You can set the CACHEPOT_DIR environment variable to change the disk cache location. By default it will use a sensible location for the current platform: ~/.cache/cachepot on Linux, %LOCALAPPDATA%\Parity\cachepot on Windows, and ~/Library/Caches/Parity.cachepot on macOS.

The default cache size is 10 gigabytes. To change this, set CACHEPOT_CACHE_SIZE, for example CACHEPOT_CACHE_SIZE="1G".
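
A minimal sketch of a local-disk configuration (the directory path here is just an example):

# use a custom cache location and cap the cache at 20 gigabytes
export CACHEPOT_DIR=/mnt/fast-disk/cachepot
export CACHEPOT_CACHE_SIZE="20G"
cargo build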

S3

If you want to use S3 storage for the cachepot cache, you need to set the CACHEPOT_BUCKET environment variable to the name of the S3 bucket to use.

You can use AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to set the S3 credentials. Alternately, you can set AWS_IAM_CREDENTIALS_URL to a URL that returns credentials in the format supported by the EC2 metadata service, and credentials will be fetched from that location as needed. In the absence of either of these options, credentials for the instance's IAM role will be fetched from the EC2 metadata service directly.

If you need to override the default endpoint you can set CACHEPOT_ENDPOINT. For example, to connect to MinIO storage you can set CACHEPOT_ENDPOINT=<ip>:<port>. If your endpoint requires TLS, set CACHEPOT_S3_USE_SSL=true.

You can also define a prefix that will be prepended to the keys of all cache objects created and read within the S3 bucket, effectively creating a scope. To do that use the CACHEPOT_S3_KEY_PREFIX environment variable. This can be useful when sharing a bucket with another application.
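
A sketch of an S3 configuration; the bucket name, credentials, endpoint, and prefix below are placeholders:

export CACHEPOT_BUCKET=my-build-cache
export AWS_ACCESS_KEY_ID=AKIA...           # or rely on the instance's IAM role
export AWS_SECRET_ACCESS_KEY=...
export CACHEPOT_ENDPOINT=10.0.0.5:9000     # e.g. a MinIO instance
export CACHEPOT_S3_USE_SSL=true            # if the endpoint requires TLS
export CACHEPOT_S3_KEY_PREFIX=project-a/   # scope keys within a shared bucket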

Redis

Set CACHEPOT_REDIS to a Redis URL in the format redis://[:<passwd>@]<hostname>[:port][/<db>] to store the cache in a Redis instance. Redis can be configured as an LRU (least recently used) cache with a fixed maximum cache size. Set maxmemory and maxmemory-policy according to the Redis documentation. The allkeys-lru policy, which discards the least recently accessed or modified keys, fits the cachepot use case well.

Redis over TLS is supported. Use the rediss:// URL scheme (note rediss vs redis). Append #insecure to the URL to disable hostname verification and accept self-signed certificates (dangerous!). Note that this also disables SNI.
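
For example (host, password, and memory limit are placeholders):

# redis.conf: bound memory usage with LRU eviction
#   maxmemory 10gb
#   maxmemory-policy allkeys-lru
export CACHEPOT_REDIS=redis://:hunter2@cache.example.com:6379/0
# or, over TLS:
export CACHEPOT_REDIS=rediss://cache.example.com:6380/0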

Memcached

Set CACHEPOT_MEMCACHED to a Memcached URL in the format tcp://<hostname>:<port> ... to store the cache in a Memcached instance.
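
For example (hostname and port are placeholders):

export CACHEPOT_MEMCACHED=tcp://memcached.example.com:11211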

Google Cloud Storage

To use Google Cloud Storage, you need to set the CACHEPOT_GCS_BUCKET environment variable to the name of the GCS bucket. If you're using authentication, either set CACHEPOT_GCS_KEY_PATH to the location of your JSON service account credentials or set CACHEPOT_GCS_CREDENTIALS_URL to a URL that returns an OAuth token. By default, cachepot on GCS will be read-only. To change this, set CACHEPOT_GCS_RW_MODE to either READ_ONLY or READ_WRITE.
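
A sketch of a GCS configuration; the bucket name and key path are placeholders:

export CACHEPOT_GCS_BUCKET=my-build-cache
export CACHEPOT_GCS_KEY_PATH=/etc/cachepot/service-account.json
export CACHEPOT_GCS_RW_MODE=READ_WRITE    # the default is read-only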

Azure

To use Azure Blob Storage, you'll need your Azure connection string and an existing Blob Storage container name. Set the CACHEPOT_AZURE_CONNECTION_STRING environment variable to your connection string, and CACHEPOT_AZURE_BLOB_CONTAINER to the name of the container to use. Note that cachepot will not create the container for you - you'll need to do that yourself.
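
For example (the connection string and container name are placeholders, and the container must already exist):

export CACHEPOT_AZURE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=..."
export CACHEPOT_AZURE_BLOB_CONTAINER=build-cache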

Important: The environment variables are only taken into account when the server starts, i.e. only on the first run.

Overwriting the cache

In situations where the cache contains broken build artifacts, it can be necessary to overwrite the contents in the cache. That can be achieved by setting the CACHEPOT_RECACHE environment variable.
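
For example, to force recompilation and overwrite the existing cache entries for one build (a sketch using cargo, but the same applies to any wrapped compiler; any value works, since only the variable's presence matters):

CACHEPOT_RECACHE=1 cargo build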

Debugging

You can set the CACHEPOT_ERROR_LOG environment variable to a path and set CACHEPOT_LOG to get the server process to redirect its logging there (including the output of unhandled panics, since the server sets RUST_BACKTRACE=1 internally).

CACHEPOT_ERROR_LOG=/tmp/cachepot_log.txt CACHEPOT_LOG=debug cachepot

You can also set these environment variables for your build system, for example

CACHEPOT_ERROR_LOG=/tmp/cachepot_log.txt CACHEPOT_LOG=debug cmake --build /path/to/cmake/build/directory

Alternatively, if you are compiling locally, you can run the server manually in foreground mode by running CACHEPOT_START_SERVER=1 CACHEPOT_NO_DAEMON=1 cachepot, and send its logging to stderr by setting the CACHEPOT_LOG environment variable, as shown below. This method is not suitable for CI services because you need to compile in another shell at the same time.

CACHEPOT_LOG=debug CACHEPOT_START_SERVER=1 CACHEPOT_NO_DAEMON=1 cachepot

Interaction with GNU make jobserver

cachepot provides support for a GNU make jobserver. When the server is started from a process that provides a jobserver, cachepot will use that jobserver and provide it to any processes it spawns. (If you are running cachepot from a GNU make recipe, you will need to prefix the command with + to get this behavior.) If the cachepot server is started without a jobserver present it will create its own with the number of slots equal to the number of available CPU cores.
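
For instance, a minimal GNU make recipe sketch; the + prefix tells make to pass its jobserver down to the cachepot command:

# Makefile
foo.o: foo.c
	+cachepot gcc -o foo.o -c foo.c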

This is most useful when using cachepot for Rust compilation, as rustc supports using a jobserver for parallel codegen, so this ensures that rustc will not overwhelm the system with codegen tasks. Cargo implements its own jobserver (see the information on NUM_JOBS in the cargo documentation) for rustc to use, so using cachepot for Rust compilation in cargo via RUSTC_WRAPPER should do the right thing automatically.

Known Caveats

General

  • Absolute paths to files must match to get a cache hit. This means that even if you are using a shared cache, everyone will have to build at the same absolute path (i.e. not in $HOME) in order to benefit each other. In Rust this includes the source for third party crates which are stored in $HOME/.cargo/registry/cache by default.

Rust

  • Crates that invoke the system linker cannot be cached. This includes bin, dylib, cdylib, and proc-macro crates. You may be able to improve compilation time of large bin crates by converting them to a lib crate with a thin bin wrapper.
  • Incrementally compiled crates cannot be cached. By default, in the debug profile Cargo will use incremental compilation for workspace members and path dependencies. You can disable incremental compilation.

More details on Rust caveats

Download Details:
Author: paritytech
Source Code: https://github.com/paritytech/cachepot
License: Apache-2.0 License

Serde Rust: Serialization Framework for Rust

Serde

Serde is a framework for serializing and deserializing Rust data structures efficiently and generically.

Serde in action

Cargo.toml (the example code below can be run in the Rust playground):

[dependencies]

# The core APIs, including the Serialize and Deserialize traits. Always
# required when using Serde. The "derive" feature is only required when
# using #[derive(Serialize, Deserialize)] to make Serde work with structs
# and enums defined in your crate.
serde = { version = "1.0", features = ["derive"] }

# Each data format lives in its own crate; the sample code below uses JSON
# but you may be using a different one.
serde_json = "1.0"

 

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let point = Point { x: 1, y: 2 };

    // Convert the Point to a JSON string.
    let serialized = serde_json::to_string(&point).unwrap();

    // Prints serialized = {"x":1,"y":2}
    println!("serialized = {}", serialized);

    // Convert the JSON string back to a Point.
    let deserialized: Point = serde_json::from_str(&serialized).unwrap();

    // Prints deserialized = Point { x: 1, y: 2 }
    println!("deserialized = {:?}", deserialized);
}

Getting help

Serde is one of the most widely used Rust libraries, so any place that Rustaceans congregate will be able to help you out. For chat, consider trying the #rust-questions or #rust-beginners channels of the unofficial community Discord (invite: https://discord.gg/rust-lang-community), the #rust-usage or #beginners channels of the official Rust Project Discord (invite: https://discord.gg/rust-lang), or the #general stream in Zulip. For asynchronous, consider the [rust] tag on StackOverflow, the /r/rust subreddit which has a pinned weekly easy questions post, or the Rust Discourse forum. It's acceptable to file a support issue in this repo, but they tend not to get as many eyes as any of the above and may get closed without a response after some time.

Download Details:
Author: serde-rs
Source Code: https://github.com/serde-rs/serde
License: View license

Serde JSON: JSON Support for Serde Framework

Serde JSON

Serde is a framework for serializing and deserializing Rust data structures efficiently and generically.

[dependencies]
serde_json = "1.0"

JSON is a ubiquitous open-standard format that uses human-readable text to transmit data objects consisting of key-value pairs.

{
    "name": "John Doe",
    "age": 43,
    "address": {
        "street": "10 Downing Street",
        "city": "London"
    },
    "phones": [
        "+44 1234567",
        "+44 2345678"
    ]
}

There are three common ways that you might find yourself needing to work with JSON data in Rust.

  • As text data. An unprocessed string of JSON data that you receive on an HTTP endpoint, read from a file, or prepare to send to a remote server.
  • As an untyped or loosely typed representation. Maybe you want to check that some JSON data is valid before passing it on, but without knowing the structure of what it contains. Or you want to do very basic manipulations like insert a key in a particular spot.
  • As a strongly typed Rust data structure. When you expect all or most of your data to conform to a particular structure and want to get real work done without JSON's loosey-goosey nature tripping you up.

Serde JSON provides efficient, flexible, safe ways of converting data between each of these representations.

Operating on untyped JSON values

Any valid JSON data can be manipulated in the following recursive enum representation. This data structure is serde_json::Value.

enum Value {
    Null,
    Bool(bool),
    Number(Number),
    String(String),
    Array(Vec<Value>),
    Object(Map<String, Value>),
}

A string of JSON data can be parsed into a serde_json::Value by the serde_json::from_str function. There is also from_slice for parsing from a byte slice &[u8] and from_reader for parsing from any io::Read like a File or a TCP stream.

use serde_json::{Result, Value};

fn untyped_example() -> Result<()> {
    // Some JSON input data as a &str. Maybe this comes from the user.
    let data = r#"
        {
            "name": "John Doe",
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ]
        }"#;

    // Parse the string of data into serde_json::Value.
    let v: Value = serde_json::from_str(data)?;

    // Access parts of the data by indexing with square brackets.
    println!("Please call {} at the number {}", v["name"], v["phones"][0]);

    Ok(())
}

The result of square bracket indexing like v["name"] is a borrow of the data at that index, so the type is &Value. A JSON map can be indexed with string keys, while a JSON array can be indexed with integer keys. If the type of the data is not right for the type with which it is being indexed, or if a map does not contain the key being indexed, or if the index into a vector is out of bounds, the returned element is Value::Null.

When a Value is printed, it is printed as a JSON string. So in the code above, the output looks like Please call "John Doe" at the number "+44 1234567". The quotation marks appear because v["name"] is a &Value containing a JSON string and its JSON representation is "John Doe". Printing as a plain string without quotation marks involves converting from a JSON string to a Rust string with as_str() or avoiding the use of Value as described in the following section.

The Value representation is sufficient for very basic tasks but can be tedious to work with for anything more significant. Error handling is verbose to implement correctly, for example imagine trying to detect the presence of unrecognized fields in the input data. The compiler is powerless to help you when you make a mistake, for example imagine typoing v["name"] as v["nmae"] in one of the dozens of places it is used in your code.

Parsing JSON as strongly typed data structures

Serde provides a powerful way of mapping JSON data into Rust data structures largely automatically.

use serde::{Deserialize, Serialize};
use serde_json::Result;

#[derive(Serialize, Deserialize)]
struct Person {
    name: String,
    age: u8,
    phones: Vec<String>,
}

fn typed_example() -> Result<()> {
    // Some JSON input data as a &str. Maybe this comes from the user.
    let data = r#"
        {
            "name": "John Doe",
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ]
        }"#;

    // Parse the string of data into a Person object. This is exactly the
    // same function as the one that produced serde_json::Value above, but
    // now we are asking it for a Person as output.
    let p: Person = serde_json::from_str(data)?;

    // Do things just like with any other Rust data structure.
    println!("Please call {} at the number {}", p.name, p.phones[0]);

    Ok(())
}

This is the same serde_json::from_str function as before, but this time we assign the return value to a variable of type Person so Serde will automatically interpret the input data as a Person and produce informative error messages if the layout does not conform to what a Person is expected to look like.

Any type that implements Serde's Deserialize trait can be deserialized this way. This includes built-in Rust standard library types like Vec<T> and HashMap<K, V>, as well as any structs or enums annotated with #[derive(Deserialize)].

Once we have p of type Person, our IDE and the Rust compiler can help us use it correctly like they do for any other Rust code. The IDE can autocomplete field names to prevent typos, which was impossible in the serde_json::Value representation. And the Rust compiler can check that when we write p.phones[0], then p.phones is guaranteed to be a Vec<String> so indexing into it makes sense and produces a String.

The necessary setup for using Serde's derive macros is explained on the Using derive page of the Serde site.

Constructing JSON values

Serde JSON provides a json! macro to build serde_json::Value objects with very natural JSON syntax.

use serde_json::json;

fn main() {
    // The type of `john` is `serde_json::Value`
    let john = json!({
        "name": "John Doe",
        "age": 43,
        "phones": [
            "+44 1234567",
            "+44 2345678"
        ]
    });

    println!("first phone number: {}", john["phones"][0]);

    // Convert to a string of JSON and print it out
    println!("{}", john.to_string());
}

The Value::to_string() function converts a serde_json::Value into a String of JSON text.

One neat thing about the json! macro is that variables and expressions can be interpolated directly into the JSON value as you are building it. Serde will check at compile time that the value you are interpolating is able to be represented as JSON.

let full_name = "John Doe";
let age_last_year = 42;

// The type of `john` is `serde_json::Value`
let john = json!({
    "name": full_name,
    "age": age_last_year + 1,
    "phones": [
        format!("+44 {}", random_phone())
    ]
});

This is amazingly convenient, but we have the problem we had before with Value: the IDE and Rust compiler cannot help us if we get it wrong. Serde JSON provides a better way of serializing strongly-typed data structures into JSON text.

Creating JSON by serializing data structures

A data structure can be converted to a JSON string by serde_json::to_string. There is also serde_json::to_vec which serializes to a Vec<u8> and serde_json::to_writer which serializes to any io::Write such as a File or a TCP stream.

use serde::{Deserialize, Serialize};
use serde_json::Result;

#[derive(Serialize, Deserialize)]
struct Address {
    street: String,
    city: String,
}

fn print_an_address() -> Result<()> {
    // Some data structure.
    let address = Address {
        street: "10 Downing Street".to_owned(),
        city: "London".to_owned(),
    };

    // Serialize it to a JSON string.
    let j = serde_json::to_string(&address)?;

    // Print, write to a file, or send to an HTTP server.
    println!("{}", j);

    Ok(())
}

Any type that implements Serde's Serialize trait can be serialized this way. This includes built-in Rust standard library types like Vec<T> and HashMap<K, V>, as well as any structs or enums annotated with #[derive(Serialize)].

Performance

It is fast. You should expect in the ballpark of 500 to 1000 megabytes per second deserialization and 600 to 900 megabytes per second serialization, depending on the characteristics of your data. This is competitive with the fastest C and C++ JSON libraries or even 30% faster for many use cases. Benchmarks live in the serde-rs/json-benchmark repo.

Getting help

Serde is one of the most widely used Rust libraries, so any place that Rustaceans congregate will be able to help you out. For chat, consider trying the #rust-questions or #rust-beginners channels of the unofficial community Discord (invite: https://discord.gg/rust-lang-community), the #rust-usage or #beginners channels of the official Rust Project Discord (invite: https://discord.gg/rust-lang), or the #general stream in Zulip. For asynchronous, consider the [rust] tag on StackOverflow, the /r/rust subreddit which has a pinned weekly easy questions post, or the Rust Discourse forum. It's acceptable to file a support issue in this repo, but they tend not to get as many eyes as any of the above and may get closed without a response after some time.

No-std support

As long as there is a memory allocator, it is possible to use serde_json without the rest of the Rust standard library. This is supported on Rust 1.36+. Disable the default "std" feature and enable the "alloc" feature:

[dependencies]
serde_json = { version = "1.0", default-features = false, features = ["alloc"] }

For JSON support in Serde without a memory allocator, please see the serde-json-core crate.

Link: https://crates.io/crates/serde_json

Beginner's Guide to Compilation in Java

Java applications are compiled to bytecode, then the JIT compiler and the JVM take care of code execution. Here you will find some insights into how the JIT compiler works.

I am guessing that many of you use Java as your primary language in your day-to-day work. Have you ever thought about why HotSpot is even called HotSpot, or what Tiered Compilation is and how it relates to Java? I will answer these questions and a few others over the course of this article. I will begin by explaining a few things about compilation itself and the theory behind it.

Turning Source Code to Machine Code

In general, we can differentiate between two basic ways of translating human-readable code into instructions that can be understood by our computers:

Static (native, AOT) Compilation

  • After the code is written, a compiler will take it and produce a binary executable file. This file will contain a set of machine code instructions targeted at a particular CPU architecture. The same binary should be able to run on CPUs with a similar instruction set, but in more complex cases your binary may fail to run and may require recompiling to meet server requirements. We lose the ability to run on multiple platforms for the benefit of faster execution on a dedicated platform.

Interpretation

  • The interpreter translates existing source code into binary code line by line, as each line is executed. Thanks to this, the application can run on every CPU that has the correct interpreter. On the other hand, execution will be slower than in the case of statically compiled languages. We benefit from the ability to run on multiple platforms but lose on execution time.

As you can see, both types have their advantages and disadvantages, are dedicated to specific use cases, and will probably fail if not used in the correct case. You may ask: if there are only two ways, does that mean Java is an interpreted or a statically compiled language?

Sccache: Shared Compilation Cache Written in Rust

sccache is a ccache-like compiler caching tool. It is used as a compiler wrapper and avoids compilation when possible, storing cached results either on local disk or in one of several cloud storage backends.

sccache includes support for caching the compilation of C/C++ code, Rust, as well as NVIDIA's CUDA using nvcc.

sccache also provides icecream-style distributed compilation (automatic packaging of local toolchains) for all supported compilers (including Rust). The distributed compilation system includes several security features that icecream lacks such as authentication, transport layer encryption, and sandboxed compiler execution on build servers. See the distributed quickstart guide for more information.

Installation

There are prebuilt x86-64 binaries available for Windows, Linux (a portable binary compiled against musl), and macOS on the releases page. Several package managers also include sccache packages; alternatively, you can install the latest release from source using cargo or build directly from a source checkout.

macOS

On macOS sccache can be installed via Homebrew:

brew install sccache

Windows

On Windows, sccache can be installed via scoop:

scoop install sccache

Via cargo

If you have a Rust toolchain installed you can install sccache using cargo. Note that this will compile sccache from source which is fairly resource-intensive. For CI purposes you should use prebuilt binary packages.

cargo install sccache

Usage

Running sccache is like running ccache: prefix your compilation commands with it, like so:

sccache gcc -o foo.o -c foo.c

If you want to use sccache for caching Rust builds you can define build.rustc-wrapper in the cargo configuration file. For example, you can set it globally in $HOME/.cargo/config.toml by adding:

[build]
rustc-wrapper = "/path/to/sccache"

Note that you need to use cargo 1.40 or newer for this to work.

Alternatively you can use the environment variable RUSTC_WRAPPER:

export RUSTC_WRAPPER=/path/to/sccache
cargo build

sccache supports gcc, clang, MSVC, rustc, NVCC, and Wind River's diab compiler.

If you don't specify otherwise, sccache will use a local disk cache.

sccache works using a client-server model, where the server runs locally on the same machine as the client. The client-server model allows the server to be more efficient by keeping some state in memory. The sccache command will spawn a server process if one is not already running, or you can run sccache --start-server to start the background server process without performing any compilation.

You can run sccache --stop-server to terminate the server. It will also terminate after (by default) 10 minutes of inactivity.

Running sccache --show-stats will print a summary of cache statistics.

Some notes about using sccache with Jenkins are available.

To use sccache with cmake, provide the following command line arguments to cmake 3.4 or newer:

-DCMAKE_C_COMPILER_LAUNCHER=sccache
-DCMAKE_CXX_COMPILER_LAUNCHER=sccache

To generate PDB files for debugging with MSVC, you can use the /Z7 option. Alternatively, the /Zi option together with /Fd can work if /Fd names a different PDB file name for each object file created. Note that CMake sets /Zi by default, so if you use CMake, you can use /Z7 by adding code like this in your CMakeLists.txt:

if(CMAKE_BUILD_TYPE STREQUAL "Debug")
  string(REPLACE "/Zi" "/Z7" CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG}")
  string(REPLACE "/Zi" "/Z7" CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG}")
elseif(CMAKE_BUILD_TYPE STREQUAL "Release")
  string(REPLACE "/Zi" "/Z7" CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE}")
  string(REPLACE "/Zi" "/Z7" CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE}")
elseif(CMAKE_BUILD_TYPE STREQUAL "RelWithDebInfo")
  string(REPLACE "/Zi" "/Z7" CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO}")
  string(REPLACE "/Zi" "/Z7" CMAKE_C_FLAGS_RELWITHDEBINFO "${CMAKE_C_FLAGS_RELWITHDEBINFO}")
endif()

By default, sccache will fail your build if it fails to successfully communicate with its associated server. To have sccache instead gracefully fail over to the local compiler without stopping, set the environment variable SCCACHE_IGNORE_SERVER_IO_ERROR=1.
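
For example (a sketch):

export SCCACHE_IGNORE_SERVER_IO_ERROR=1
cargo build   # compilation proceeds uncached if the sccache server cannot be reached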

Build Requirements

sccache is a Rust program. Building it requires cargo (and thus rustc). sccache currently requires Rust 1.58.0. We recommend you install Rust via Rustup.

Build

If you are building sccache for non-development purposes, make sure you use cargo build --release to get optimized binaries:

cargo build --release [--no-default-features --features=s3|redis|gcs|memcached|azure]

By default, sccache builds with support for all storage backends, but individual backends may be disabled by resetting the list of features and enabling all the other backends. Refer to the Cargo Documentation for details on how to select features with Cargo.

Building portable binaries

When building with the dist-server feature, sccache will depend on OpenSSL, which can be an annoyance if you want to distribute portable binaries. It is possible to statically link against OpenSSL using the openssl/vendored feature.

Linux

Build with cargo and use ldd to check that the resulting binary does not depend on OpenSSL anymore.

macOS

Build with cargo and use otool -L to check that the resulting binary does not depend on OpenSSL anymore.

Windows

On Windows, the binary might also depend on a few MSVC CRT DLLs that are not available on older Windows versions.

It is possible to statically link against the CRT using a .cargo/config.toml file with the following contents.

[target.x86_64-pc-windows-msvc]
rustflags = ["-Ctarget-feature=+crt-static"]

Build with cargo and use dumpbin /dependents to check that the resulting binary does not depend on MSVC CRT DLLs anymore.

When statically linking with OpenSSL, you will need Perl available in your $PATH.

Storage Options

Local

sccache defaults to using local disk storage. You can set the SCCACHE_DIR environment variable to change the disk cache location. By default it will use a sensible location for the current platform: ~/.cache/sccache on Linux, %LOCALAPPDATA%\Mozilla\sccache on Windows, and ~/Library/Caches/Mozilla.sccache on macOS.

The default cache size is 10 gigabytes. To change this, set SCCACHE_CACHE_SIZE, for example SCCACHE_CACHE_SIZE="1G".

S3

If you want to use S3 storage for the sccache cache, you need to set the SCCACHE_BUCKET environment variable to the name of the S3 bucket to use.

You can use AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to set the S3 credentials. Alternately, you can set AWS_IAM_CREDENTIALS_URL to a URL that returns credentials in the format supported by the EC2 metadata service, and credentials will be fetched from that location as needed. In the absence of either of these options, credentials for the instance's IAM role will be fetched from the EC2 metadata service directly.

If you need to override the default endpoint you can set SCCACHE_ENDPOINT. For example, to connect to MinIO storage you can set SCCACHE_ENDPOINT=<ip>:<port>. If your endpoint requires TLS, set SCCACHE_S3_USE_SSL=true.

You can also define a prefix that will be prepended to the keys of all cache objects created and read within the S3 bucket, effectively creating a scope. To do that use the SCCACHE_S3_KEY_PREFIX environment variable. This can be useful when sharing a bucket with another application.

Redis

Set SCCACHE_REDIS to a Redis URL in the format redis://[:<passwd>@]<hostname>[:port][/<db>] to store the cache in a Redis instance. Redis can be configured as an LRU (least recently used) cache with a fixed maximum cache size. Set maxmemory and maxmemory-policy according to the Redis documentation. The allkeys-lru policy, which discards the least recently accessed or modified keys, fits the sccache use case well.

Redis over TLS is supported. Use the rediss:// URL scheme (note rediss vs redis). Append #insecure to the URL to disable hostname verification and accept self-signed certificates (dangerous!). Note that this also disables SNI.

Memcached

Set SCCACHE_MEMCACHED to a Memcached URL in the format tcp://<hostname>:<port> ... to store the cache in a Memcached instance.

Google Cloud Storage

To use Google Cloud Storage, you need to set the SCCACHE_GCS_BUCKET environment variable to the name of the GCS bucket.

If you're using authentication, either:

  • Set SCCACHE_GCS_KEY_PATH to the location of your JSON service account credentials
  • (Deprecated) Set SCCACHE_GCS_CREDENTIALS_URL to a URL returning an OAuth token in non-standard {"accessToken": "...", "expireTime": "..."} format.
  • Set SCCACHE_GCS_OAUTH_URL to a URL returning an OAuth token. If you are running on a Google Cloud instance, this is of the form http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/${YOUR_SERVICE_ACCOUNT}/token

By default, sccache on GCS will be read-only. To change this, set SCCACHE_GCS_RW_MODE to either READ_ONLY or READ_WRITE.

You can also define a prefix that will be prepended to the keys of all cache objects created and read within the GCS bucket, effectively creating a scope. To do that use the SCCACHE_GCS_KEY_PREFIX environment variable. This can be useful when sharing a bucket with another application.

Azure

To use Azure Blob Storage, you'll need your Azure connection string and an existing Blob Storage container name. Set the SCCACHE_AZURE_CONNECTION_STRING environment variable to your connection string, and SCCACHE_AZURE_BLOB_CONTAINER to the name of the container to use. Note that sccache will not create the container for you - you'll need to do that yourself.

You can also define a prefix that will be prepended to the keys of all cache objects created and read within the container, effectively creating a scope. To do that use the SCCACHE_AZURE_KEY_PREFIX environment variable. This can be useful when sharing a bucket with another application.

Important: The environment variables are only taken into account when the server starts, i.e. only on the first run.

Separating caches between invocations

In situations where several different compilation invocations should not reuse the cached results from each other, one can set SCCACHE_C_CUSTOM_CACHE_BUSTER to a unique value that'll be mixed into the hash. MACOSX_DEPLOYMENT_TARGET and IPHONEOS_DEPLOYMENT_TARGET variables already exhibit such reuse-suppression behaviour. There are currently no such variables for compiling Rust.
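
For example, to keep two differently configured CI pipelines from sharing C/C++ cache entries (the value below is an arbitrary placeholder; any unique string works):

export SCCACHE_C_CUSTOM_CACHE_BUSTER=ci-config-arm64-v2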

Overwriting the cache

In situations where the cache contains broken build artifacts, it can be necessary to overwrite the contents in the cache. That can be achieved by setting the SCCACHE_RECACHE environment variable.

Debugging

You can set the SCCACHE_ERROR_LOG environment variable to a path and set SCCACHE_LOG to get the server process to redirect its logging there (including the output of unhandled panics, since the server sets RUST_BACKTRACE=1 internally).

SCCACHE_ERROR_LOG=/tmp/sccache_log.txt SCCACHE_LOG=debug sccache

You can also set these environment variables for your build system, for example

SCCACHE_ERROR_LOG=/tmp/sccache_log.txt SCCACHE_LOG=debug cmake --build /path/to/cmake/build/directory

Alternatively, if you are compiling locally, you can run the server manually in foreground mode by running SCCACHE_START_SERVER=1 SCCACHE_NO_DAEMON=1 sccache, and send its logging to stderr by setting the SCCACHE_LOG environment variable, as shown below. This method is not suitable for CI services because you need to compile in another shell at the same time.

SCCACHE_LOG=debug SCCACHE_START_SERVER=1 SCCACHE_NO_DAEMON=1 sccache

Interaction with GNU make jobserver

sccache provides support for a GNU make jobserver. When the server is started from a process that provides a jobserver, sccache will use that jobserver and provide it to any processes it spawns. (If you are running sccache from a GNU make recipe, you will need to prefix the command with + to get this behavior.) If the sccache server is started without a jobserver present it will create its own with the number of slots equal to the number of available CPU cores.

This is most useful when using sccache for Rust compilation, as rustc supports using a jobserver for parallel codegen, so this ensures that rustc will not overwhelm the system with codegen tasks. Cargo implements its own jobserver (see the information on NUM_JOBS in the cargo documentation) for rustc to use, so using sccache for Rust compilation in cargo via RUSTC_WRAPPER should do the right thing automatically.

Known Caveats

General

  • Absolute paths to files must match to get a cache hit. This means that even if you are using a shared cache, everyone will have to build at the same absolute path (i.e. not in $HOME) in order to benefit each other. In Rust this includes the source for third party crates which are stored in $HOME/.cargo/registry/cache by default.

Rust

  • Crates that invoke the system linker cannot be cached. This includes bin, dylib, cdylib, and proc-macro crates. You may be able to improve compilation time of large bin crates by converting them to a lib crate with a thin bin wrapper.
  • Incrementally compiled crates cannot be cached. By default, in the debug profile Cargo will use incremental compilation for workspace members and path dependencies. You can disable incremental compilation.

More details on Rust caveats

Symbolic links

  • Symbolic links to sccache won't work. Use hardlinks: ln sccache /usr/local/bin/cc

Download Details:
Author: mozilla
Source Code: https://github.com/mozilla/sccache/
License: Apache-2.0 license
