Echopraxia: Java Logging API with Structured Logging and Conditional Logging

Echopraxia

Echopraxia is a Java logging API designed around structured logging, rich context, and conditional logging. There are Logback and Log4J2 implementations, but Echopraxia's API is completely dependency-free, meaning it can be implemented with any logging API, e.g. jboss-logging, JUL, the System.Logger API (JEP 264), or even directly.

Echopraxia is a sequel to the Scala logging API Blindsight, hence the name: "Echopraxia is the involuntary repetition or imitation of an observed action."

Echopraxia is based around several main concepts that build on and leverage each other:

  • Structured Logging (API based around structured fields and values)
  • Contextual Logging (API based around building state in loggers)
  • Conditions (API based around context-aware functions and dynamic scripting)
  • Semantic Logging (API based around typed arguments)
  • Fluent Logging (API based around log entry builder)
  • Filters (pipeline for adding fields and conditions to loggers)

Although Echopraxia is tied to an implementation on the backend, it is designed to hide implementation details from you, just as SLF4J hides the details of the logging implementation. For example, logstash-logback-encoder provides Markers and StructuredArguments, but you will not see them in the API. Instead, Echopraxia works with independent Field and Value objects that are converted by a CoreLogger provided by the implementation -- for example, LogstashCoreLogger converts fields into StructuredArguments on the backend.

Please see the blog posts for more background on logging.

Examples

Simple examples and integrations with dropwizard metrics and OSHI are available at echopraxia-examples.

For a web application example, see this Spring Boot Project.

Statement of Intent

Echopraxia is not a replacement for SLF4J. It is not an attempt to compete with the Log4J2 API, JUL, or commons-logging for the title of "one true logging API" and restart the logging mess. SLF4J won that fight a long time ago.

Echopraxia is a structured logging API. It is an appropriate solution when you control the logging implementation and have decided you're going to do structured logging, e.g. a web application where you've decided to use logstash-logback-encoder already.

SLF4J is an appropriate solution when you do not control the logging output, e.g. in an open-source library that could be used in arbitrary situations by anybody.


Echopraxia is best described as a specialization or augmentation for application code -- as you build framework support code for your application and build up your domain objects, you can write custom field builders, then log everywhere in your application with a consistent schema.

Why Structured Logging?

Structured logging enables logs to be queried as semi-structured data. There are other structured logging frameworks, like Structlog (Python), Ruby-Cabin (Ruby), Logrus (Go), and Serilog (C#).

Ruby-Cabin has the best take on this:

Structured data means you don't need crazy regular expression skills to make sense of logs.

You can read more about structured logging here.

Why Conditions?

Conditions address the challenge of "whether-to-log": dynamically adjusting the degree of logging in response to runtime requirements.

A proof of concept of dynamic debug logging using Echopraxia is here.

A Comprehensive Survey of Logging in Software and The Bones of the System: A Study of Logging and Telemetry at Microsoft are great discussions of the implications of being able to adjust logging conditions at runtime.

Benchmarks

Benchmarks are available at BENCHMARKS.md.

Please be aware that how fast and how much you can log is dramatically impacted by your use of an asynchronous appender, your available I/O, your storage, and your ability to manage and process logs.

Logstash

There is a Logback implementation based around logstash-logback-encoder. This library does not provide a front end logger API, so you must pick (or create) one yourself, i.e. normal, async, fluent, or semantic.

Maven:

<dependency>
  <groupId>com.tersesystems.echopraxia</groupId>
  <artifactId>logstash</artifactId>
  <version><VERSION></version>
</dependency>

Gradle:

implementation "com.tersesystems.echopraxia:logstash:<VERSION>" 

Log4J

There is a Log4J implementation that works with the JSON Template Layout. This provides a core logger implementation but does not provide a user visible logging API.

Maven:

<dependency>
  <groupId>com.tersesystems.echopraxia</groupId>
  <artifactId>log4j</artifactId>
  <version><VERSION></version>
</dependency>

Gradle:

implementation "com.tersesystems.echopraxia:log4j:<VERSION>" 

You will need to integrate the com.tersesystems.echopraxia.log4j.layout package into your log4j2.xml file, e.g. by using the packages attribute, and add an EventTemplateAdditionalField element:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN" packages="com.tersesystems.echopraxia.log4j.layout">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT" follow="true">
            <JsonTemplateLayout eventTemplateUri="classpath:LogstashJsonEventLayoutV1.json">
                <EventTemplateAdditionalField
                        key="fields"
                        format="JSON"
                        value='{"$resolver": "echopraxiaFields"}'/>
            </JsonTemplateLayout>
        </Console>
    </Appenders>
    <Loggers>
        <Root level="info">
            <AppenderRef ref="Console" />
        </Root>
    </Loggers>
</Configuration>

If you want to separate the context fields from the argument fields, you can define them separately:

<JsonTemplateLayout eventTemplateUri="classpath:LogstashJsonEventLayoutV1.json">
    <EventTemplateAdditionalField
            key="arguments"
            format="JSON"
            value='{"$resolver": "echopraxiaArgumentFields"}'/>
    <EventTemplateAdditionalField
            key="context"
            format="JSON"
            value='{"$resolver": "echopraxiaContextFields"}'/>
</JsonTemplateLayout>

Unfortunately, I don't know of a way to "flatten" fields so that they show up on the root object instead of under an additional field. If you know how to do this, let me know!

Basic Usage

For most use cases, you will be working with the basic logger, which uses a pluggable FieldBuilder. The logger is a separate dependency from the API -- the logger is not packaged with the API, so you can easily create your own custom logger if you want.

Maven:

<dependency>
  <groupId>com.tersesystems.echopraxia</groupId>
  <artifactId>logger</artifactId>
  <version><VERSION></version>
</dependency>

Gradle:

implementation "com.tersesystems.echopraxia:logger:<VERSION>" 

Add the import:

import com.tersesystems.echopraxia.*;

First you define a logger (usually in a controller or singleton -- getClass() is particularly useful for abstract controllers):

final Logger<?> basicLogger = LoggerFactory.getLogger(getClass());

Logging simple messages and exceptions is done as in SLF4J:

try {
  ...
  basicLogger.info("Simple message");
} catch (Exception e) {
  basicLogger.error("Error message", e);  
}

However, when you log arguments, you pass a function which provides you with a field builder and returns a FieldBuilderResult -- a Field is a FieldBuilderResult, so you can do:

basicLogger.info("Message name {}", fb -> fb.string("name", "value"));

If you are returning multiple fields, then using fb.list will return a FieldBuilderResult:

basicLogger.info("Message name {} age {}", fb -> fb.list(
  fb.string("name", "value"),
  fb.number("age", 13)
));

And fb.list can take many inputs as needed, for example a stream:

basicLogger.info("Message name {}", fb -> {
  Stream<Field> fieldStream = ...;
  return fb.list(fieldStream);
});

You can log multiple arguments and include the exception if you want the stack trace:

basicLogger.info("Message name {}", fb -> fb.list(
  fb.string("name", "value"),
  fb.exception(e)
));

In older versions, fb.only() was required to convert a Field into a FieldBuilderResult -- this is no longer required, but a FieldBuilderWithOnly interface is available to maintain those methods.

Note that unlike SLF4J, you don't have to worry about including the exception as an argument "swallowing" the stacktrace. If an exception is present, it's always applied to the underlying logger.

So far so good. But logging strings and numbers can get tedious. Let's go into custom field builders.

Custom Field Builders

Echopraxia lets you specify custom field builders whenever you want to log domain objects:

import com.tersesystems.echopraxia.api.*;

import java.time.format.DateTimeFormatter;
import java.util.Date;

public class BuilderWithDate implements FieldBuilder {
  public BuilderWithDate() {}

  // Renders a date as an ISO 8601 string.
  public Value.StringValue dateValue(Date date) {
    return Value.string(DateTimeFormatter.ISO_INSTANT.format(date.toInstant()));
  }

  public Field date(String name, Date date) {
    return value(name, dateValue(date));
  }
}

And now you can render a date automatically:

Logger<BuilderWithDate> dateLogger = basicLogger.withFieldBuilder(BuilderWithDate.class);
dateLogger.info("Date {}", fb -> fb.date("creation_date", new Date()));

This also applies to more complex objects. In the custom field builder example, the Person class is rendered using a custom field builder:

public class PersonFieldBuilder implements FieldBuilder {

  // Renders a `Person` as an object field.
  public Field person(String fieldName, Person p) {
    return keyValue(fieldName, personValue(p));
  }

  public Value<?> personValue(Person p) {
    if (p == null) {
      return Value.nullValue();
    }
    Field name = string("name", p.name());
    Field age = number("age", p.age());
    // optional returns either an object value or null value, keyValue is untyped
    Field father = keyValue("father", Value.optional(p.getFather().map(this::personValue)));
    Field mother = keyValue("mother", Value.optional(p.getMother().map(this::personValue)));
    Field interests = array("interests", p.interests());
    return Value.object(name, age, father, mother, interests);
  }
}

And then you can do the same by calling fb.person:

Person user = ...
Logger<PersonFieldBuilder> personLogger = basicLogger.withFieldBuilder(PersonFieldBuilder.class);
personLogger.info("Person {}", fb -> fb.person("user", user));

Custom Logger Factories

If you are using a particular set of field builders for your domain and want them available by default, it's easy to create your own logger with your own field builder, using the support classes and interfaces.

Creating your own logger will also remove the type parameter from your code, so you don't have to type Logger<?> everywhere, and allow you to create custom methods that leverage field builders.

If you want to make sure your logger is the only one available, you should import only the API:

Maven:

<dependency>
  <groupId>com.tersesystems.echopraxia</groupId>
  <artifactId>api</artifactId>
  <version><VERSION></version>
</dependency>

Gradle:

implementation "com.tersesystems.echopraxia:api:<VERSION>" 

And then continuing on from the custom field builder example, you can build a PersonLogger:

import com.tersesystems.echopraxia.api.*;

public final class PersonLogger extends AbstractLoggerSupport<PersonLogger, PersonFieldBuilder>
  implements DefaultLoggerMethods<PersonFieldBuilder> {
  private static final String FQCN = PersonLogger.class.getName();

  protected PersonLogger(
    @NotNull CoreLogger core, @NotNull PersonFieldBuilder fieldBuilder, Class<?> selfType) {
    super(core, fieldBuilder, selfType);
  }

  public void info(@Nullable String message, Person person) {
    // when using custom methods, you must specify the caller as the class it's defined in.
    this.core().withFQCN(FQCN).log(Level.INFO, message,
      fb -> fb.person("person", person), fieldBuilder);
  }

  @Override
  protected @NotNull PersonLogger newLogger(CoreLogger core) {
    return new PersonLogger(core, fieldBuilder(), PersonLogger.class);
  }

  @Override
  protected @NotNull PersonLogger neverLogger() {
    return new PersonLogger(
      core.withCondition(Condition.never()), fieldBuilder(), PersonLogger.class);
  }
}

and a custom logger factory:

public final class PersonLoggerFactory {

  private static final PersonFieldBuilder myFieldBuilder = new PersonFieldBuilder();

  // the class containing the error/warn/info/debug/trace methods
  private static final String FQCN = DefaultLoggerMethods.class.getName();

  public static PersonLogger getLogger(Class<?> clazz) {
    return getLogger(CoreLoggerFactory.getLogger(FQCN, clazz.getName()));
  }

  public static PersonLogger getLogger(String name) {
    return getLogger(CoreLoggerFactory.getLogger(FQCN, name));
  }

  public static PersonLogger getLogger() {
    return getLogger(CoreLoggerFactory.getLogger(FQCN, Caller.resolveClassName()));
  }

  public static PersonLogger getLogger(@NotNull CoreLogger core) {
    return new PersonLogger(core, myFieldBuilder, PersonLogger.class);
  }
}

and then you can log a person as a raw parameter:

PersonLogger logger = PersonLoggerFactory.getLogger();
Person abe = ...
logger.info("Best person: {}", abe);

Generally loggers should be final, and any common functionality should be moved out to interfaces you can share. This is because subclassing can have an impact on JVM optimizations, and can make returning specific types from with* methods more complicated.

Nulls and Exceptions

By default, values are @NotNull, and passing in null to values is not recommended. If you want to handle nulls, you can extend the field builder as necessary:

public interface NullableFieldBuilder extends FieldBuilder {
  // extend as necessary
  default Field nullableString(String name, String nullableString) {
    Value<?> nullableValue = (nullableString == null) ? Value.nullValue() : Value.string(nullableString);
    return keyValue(name, nullableValue);
  }
}

Field names are never allowed to be null. If a field name is null, it will be replaced at runtime with unknown-echopraxia-N where N is an incrementing number.

logger.info("Message name {}", fb -> 
  fb.string(null, "some-value") // null field names not allowed
);

Because a field builder function runs in a closure, if an exception occurs it will be caught by the default thread exception handler. If it's the main thread, it will print to the console and terminate the JVM; other threads will swallow the exception whole. Consider setting a default thread exception handler that additionally logs, and avoid uncaught exceptions in field builder closures:

logger.info("Message name {}", fb -> {
  String name = methodThatThrowsException(); // BAD
  return fb.string(name, "some-value");
});

Instead, only call field builder methods inside the closure and keep any construction logic outside:

String name = methodThatThrowsException(); // GOOD
logger.info("Message name {}", fb -> {
  return fb.string(name, "some-value");
});

Context

You can also add fields directly to the logger using logger.withFields for contextual logging:

Logger<?> loggerWithFoo = basicLogger.withFields(fb -> fb.string("foo", "bar"));

// will log "foo": "bar" field in a JSON appender.
loggerWithFoo.info("JSON field will log automatically");

This works very well for HTTP session and request data such as correlation ids.
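
For example, a hypothetical sketch of a request-scoped logger -- the correlation id lookup is an assumption about your request handling code:

// hypothetical: correlationId comes from your own request handling code
String correlationId = request.getHeader("X-Correlation-ID");
Logger<?> requestLogger = basicLogger.withFields(fb -> fb.string("correlation_id", correlationId));

// every statement on requestLogger now carries "correlation_id"
requestLogger.info("Request accepted");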

One thing to be aware of is that the popular idiom of using public static final Logger<?> logger can be limiting in cases where you want to include context data. For example, if you have a number of objects with their own internal state, it may be more appropriate to create a logger field on the object.

public class PlayerData {

  // the date is scoped to an instance of this player
  private Date lastAccessedDate = new Date();

  // logger is not static because lastAccessedDate is an instance variable
  private final Logger<BuilderWithDate> logger =
      LoggerFactory.getLogger()
          .withFieldBuilder(BuilderWithDate.class)
          .withFields(fb -> fb.date("last_accessed_date", lastAccessedDate));

}

Because values may be tied to state or variables that can change between method calls, the function passed to withFields is call-by-name, i.e. it is evaluated on every logging statement (including for evaluation against any conditions), and the statement will contain whatever the current value of lastAccessedDate is at evaluation time.

It's important to note that context evaluation only happens after enabled checks. For example, isInfoEnabled will not trigger context evaluation by itself. However, any implementation specific markers attached to the context will be passed in for the enabled check, as both Logback and Log4J incorporate marker checks in isEnabled.

// will not evaluate context fields
// will evaluate any impl-specific markers (Logback or Log4J)
boolean enabled = logger.isInfoEnabled();

The way that context works in conjunction with conditions is more involved, and is covered in the conditions section.

Thread Context

You can also resolve any values in Mapped Diagnostic Context (MDC) into fields, using logger.withThreadContext(). This method provides a pre-built function that calls fb.string for each entry in the map.

Because MDC is thread-local, if you pass the logger between threads or use asynchronous processing, e.g. CompletionStage/CompletableFuture, you may have inconsistent results.

org.slf4j.MDC.put("mdckey", "mdcvalue");
myLogger.withThreadContext().info("This statement has MDC values in context");

This method is call by name, and so will provide the MDC state as fields at the time the logging statement is evaluated.
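
A minimal sketch of the call-by-name behavior -- the MDC map is read when each statement runs, not when the logger is created:

org.slf4j.MDC.put("stage", "one");
Logger<?> mdcLogger = myLogger.withThreadContext();
mdcLogger.info("sees stage=one");

org.slf4j.MDC.put("stage", "two");
mdcLogger.info("sees stage=two"); // MDC is re-read at statement time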

Thread Safety

Thread safety is something to be aware of when using context fields. While fields themselves are thread-safe, and using a context is far more convenient than using MDC, you still have to take care when accessing non-thread-safe state.

For example, SimpleDateFormat is infamously not thread-safe, and so the following code is not safe to use in a multi-threaded context:

private final static DateFormat df = new SimpleDateFormat("yyyyMMdd");

// UNSAFE EXAMPLE
private static final Logger<?> logger =
        LoggerFactory.getLogger()
        .withFields(fb -> fb.string("unsafe_date", df.format(new Date())));

Conditions

Logging conditions can be handled gracefully using Condition functions. A Condition takes a Level and a LoggingContext, which returns the fields of the logger.

final Condition errorCondition = new Condition() {
  @Override
  public boolean test(Level level, LoggingContext context) {
    return level.equals(Level.ERROR);
  }
};

Conditions can be used either on the logger, on the statement, or against the predicate check.

There are two elemental conditions, Condition.always() and Condition.never(). Echopraxia has optimizations for conditions: it treats Condition.always() as a no-op, and returns a NeverLogger that has no operations for logging when given Condition.never(). The JVM can recognize that logging has no effect at all, and will eliminate the method call as dead code.
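
For example, a minimal sketch of both conditions:

// returns a no-op NeverLogger; statements on it are eligible for dead code elimination
Logger<?> silenced = basicLogger.withCondition(Condition.never());
silenced.info("never logged", fb -> fb.string("name", "value"));

// Condition.always() is a no-op: this logger behaves exactly like basicLogger
Logger<?> unchanged = basicLogger.withCondition(Condition.always());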

Conditions are a great way to manage diagnostic logging in your application with more flexibility than global log levels can provide. Consider setting your application logging to DEBUG, i.e. <logger name="your.application.package" level="DEBUG"/>, and using conditions to turn debugging on and off as needed. Conditions come with and, or, and xor functionality, and can provide more precise and expressive logging criteria than can be managed with filters and markers. This is particularly useful when combined with filters.
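
As a sketch, a condition can key off any runtime-togglable flag -- the AtomicBoolean here is an assumption, and a feature flag service or database lookup would work the same way:

// assumption: a flag flipped at runtime, e.g. through JMX or an admin endpoint
private static final AtomicBoolean debugFlag = new AtomicBoolean(false);
private static final Condition debugCondition = (level, ctx) -> debugFlag.get();

// with the logger level at DEBUG, statements only fire while the flag is set
Logger<?> debugLogger = logger.withCondition(debugCondition);
debugLogger.debug("only logs while debugFlag is true");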

For example, if you want to have logging that only activates during business hours, you can use the following:

import com.tersesystems.echopraxia.Condition;

public class MyBusinessConditions {
  private static final Clock officeClock = Clock.system(ZoneId.of("America/Los_Angeles"));

  public Condition businessHoursOnly() {
    return Condition.operational().and(weekdays().and(from9to5()));
  }
  
  public Condition weekdays() {
    return (level, context) -> {
      LocalDate now = LocalDate.now(officeClock);
      final DayOfWeek dayOfWeek = now.getDayOfWeek();
      return ! (dayOfWeek.equals(DayOfWeek.SATURDAY) || dayOfWeek.equals(DayOfWeek.SUNDAY));
    };
  }
  
  public Condition from9to5() {
    return (level, context) -> LocalTime.now(officeClock).query(temporal -> {
      // hour is zero based, so adjust for readability
      final int hour = temporal.get(ChronoField.HOUR_OF_DAY) + 1;
      return (hour >= 9) && (hour <= 17); // 8 am to 5 pm
    });
  }
}

This is only a part of the available functionality in conditions. You can tie conditions directly to a backend, such as a database or key/value store, or trigger them to work in response to an exception or unusual metrics. See the redis example, jmx example, metrics example, and timed diagnostic example.

JSON Path

In situations where you're looking through fields for a condition, you can use JSONPath to find values from the logging context in a condition.

Tip: You can integrate IntelliJ IDEA with JSONPath using the inject language settings page and adding LoggingContext.

The context.find* methods take a class as a type, and a JSON path, which can be used to search through context fields (or arguments, if the condition is used in a logging statement).

The basic types are String, the Number subclasses such as Integer, and Boolean. If no matching path is found, an empty Optional is returned.

Optional<String> optName = context.findString("$.person.name");

This also applies to Throwables, which are usually passed in as arguments:

Optional<Throwable> optThrowable = context.findThrowable();

You can treat a Throwable as a JSON object, i.e. the following will all work with the default $.exception path:

Optional<String> className = ctx.findString("$.exception.className");
Optional<String> message = ctx.findString("$.exception.message");
Optional<Throwable> cause = ctx.findThrowable("$.exception.cause");

And you can also query stack trace elements:

Optional<Map<String, ?>> stacktraceElement = ctx.findObject("$.exception.stackTrace[0]");
Optional<String> methodName = ctx.findString("$.exception.stackTrace[0].methodName");
List<?> listOfElements = ctx.findList("$.exception.stackTrace[5..10]");

Finding an explicitly null value returns a boolean:

// fb.nullValue("keyWithNullValue") sets an explicitly null value
boolean isNull = context.findNull("$.keyWithNullValue");

Finding an object will return a Map:

Optional<Map<String, ?>> mother = context.findObject("$.person.mother");

For a List, in the case of an array value or when using indefinite queries, use findList:

List<String> interests = context.findList("$.person.mother.interests");

You can use inline predicates, which will return a List of the results:

final Condition cheapBookCondition =
  (level, context) -> ! context.findList("$.store.book[?(@.price < 10)]").isEmpty();

The inline and filter predicates are not available for exceptions. Instead, you must use filter:

class FindException {
  void logException() {
    Condition throwableCondition =
      (level, ctx) ->
        ctx.findThrowable()
          .filter(e -> "test message".equals(e.getMessage()))
          .isPresent();
    
    logger.error(throwableCondition, "Error message", new RuntimeException("test message"));
  }
}

There are many more options available using JSONPath. You can try out the online evaluator to test out expressions.

Logger

You can use conditions in a logger, and statements will only log if the condition is met:

Logger<?> loggerWithCondition = logger.withCondition(condition);

You can also build up conditions:

Logger<?> loggerWithAandB = logger.withCondition(conditionA).withCondition(conditionB);

Conditions are only evaluated once a level/marker check is passed, so something like

loggerWithAandB.trace("some message");

will short circuit on the level check before any condition is reached.

Conditions look for fields, but those fields can come from either context or argument. For example, the following condition will log because the condition finds an argument field:

Condition cond = (level, ctx) -> ctx.findString("somename").isPresent();
logger.withCondition(cond).info("some message", fb -> fb.string("somename", "somevalue")); // matches argument

Statement

You can also use conditions in an individual statement:

logger.info(mustHaveFoo, "Only log if foo is present");

Predicates

Conditions can also be used in predicate blocks for expensive objects.

if (logger.isInfoEnabled(condition)) {
  // only true if the condition passes and INFO level is enabled
}

Conditions will only be checked after an isEnabled check is passed -- the level (and optional marker) is always checked first, before any conditions.

A condition may also evaluate context fields that are set in a logger:

// Conditions may evaluate context
Condition cond = (level, ctx) -> ctx.findString("somename").isPresent();
boolean loggerEnabled = logger
  .withFields(fb -> fb.string("somename", "somevalue"))
  .withCondition(cond)
  .isInfoEnabled();

Using a predicate with a condition does not trigger any logging, so it can be a nice way to "dry run" a condition. Note that the context evaluation takes place every time a condition is run, so doing something like this is not good:

Logger<?> loggerWithContextAndCondition = logger
  .withFields(fb -> fb.string("somename", "somevalue"))
  .withCondition(condition);

// check evaluates context
if (loggerWithContextAndCondition.isInfoEnabled()) {
  // info statement _also_ evaluates context
  loggerWithContextAndCondition.info("some message");
}

This results in the context being evaluated both in the block and in the info statement itself, which is inefficient.

It is generally preferable to pass in a condition explicitly on the statement, as it will only evaluate once.

Logger<?> loggerWithContext = logger
  .withFields(fb -> fb.string("somename", "somevalue"))
loggerWithContext.info(condition, "message");

or to use the logger that already has the condition attached:

loggerWithContextAndCondition.info("some message");

Asynchronous Logging

By default, conditions are evaluated in the running thread. This can be a problem if conditions rely on external elements such as network calls or database lookups, or involve resources with locks.

Echopraxia provides an AsyncLogger that will evaluate conditions and log using another executor, so that the main business logic thread is not blocked on execution. All statements are placed on a work queue and run on a thread specified by the executor at a later time.

Installation

Maven:

<dependency>
  <groupId>com.tersesystems.echopraxia</groupId>
  <artifactId>async</artifactId>
  <version><VERSION></version>
</dependency>

Gradle:

implementation "com.tersesystems.echopraxia:async:<VERSION>" 

Usage

All the usual logging statements are available in AsyncLogger, i.e. logger.debug will log as usual.

However, there are no isLogging* methods in the AsyncLogger. Instead, a Consumer of LoggerHandle is used, which serves the same purpose as the if (isLogging*()) { .. } block.

import com.tersesystems.echopraxia.async.*;

AsyncLogger<?> logger = AsyncLoggerFactory.getLogger().withExecutor(loggingExecutor);
logger.info(handle -> {
  // do conditional logic that would normally happen in an if block
  // this may be expensive or blocking because it runs asynchronously
  handle.log("Message template {}", fb -> fb.string("foo", "bar");
});

In the unfortunate event of an exception, the underlying logger will be called at error level from the relevant core logger:

// only happens if uncaught exception from consumer
slf4jLogger.error("Uncaught exception when running asyncLog", cause);

One important detail is that the logging executor should be a daemon thread, so that it does not block JVM exit:

private static final Executor loggingExecutor =
  Executors.newSingleThreadExecutor(
      r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // daemon means the thread won't stop JVM from exiting
        t.setName("logging-thread");
        return t;
      });

Using a single-threaded executor is nice because it preserves the ordering of your logging statements, but it may not scale up in production. Generally speaking, if you are CPU-bound and want to distribute load over several cores, you should use ForkJoinPool.commonPool() or a bounded fork-join work-stealing pool as your executor. If your conditions involve blocking, or the work is IO-bound, you should configure a thread pool executor.
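
For example, a minimal sketch of a bounded thread pool for blocking condition work -- all sizes here are illustrative assumptions, not recommendations:

private static final ExecutorService blockingLoggingExecutor =
    new ThreadPoolExecutor(
        2, 8,                              // core and max pool size (illustrative)
        60L, TimeUnit.SECONDS,             // idle thread keep-alive
        new LinkedBlockingQueue<>(10000),  // bounded work queue
        r -> {
          Thread t = new Thread(r);
          t.setDaemon(true); // daemon so the pool won't stop JVM from exiting
          t.setName("echopraxia-blocking-logging");
          return t;
        });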

Likewise, if your conditions involve calls to external services (for example, calling Consul or a remote HTTP service), you should consider using a failure handling library like failsafe to set up appropriate circuit breakers, bulkheads, timeouts, and rate limiters to manage interactions.

Because of parallelism and concurrency, your logging statements may not appear in order if you have multiple threads running in your executor. You can add extra fields to ensure you can reorder statements appropriately.
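
One hedged sketch: add a monotonic sequence number so statements can be re-sorted downstream (the seq field name and AtomicLong counter are assumptions):

private static final AtomicLong seq = new AtomicLong();

logger.info("out of order is fine", fb -> fb.number("seq", seq.incrementAndGet()));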

Putting it all together:

public class Async {
  private static final ExecutorService loggingExecutor =
      Executors.newSingleThreadExecutor(
          r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // daemon means the thread won't stop JVM from exiting
            t.setName("echopraxia-logging-thread");
            return t;
          });

  private static final Condition expensiveCondition = new Condition() {
    @Override
    public boolean test(Level level, LoggingContext context) {
      try {
        Thread.sleep(1000L);
        return true;
      } catch (InterruptedException e) {
        return false;
      }
    }
  };

  private static final AsyncLogger<?> logger = AsyncLoggerFactory.getLogger()
    .withExecutor(loggingExecutor)
    .withCondition(expensiveCondition);

  public static void main(String[] args) throws InterruptedException {
    System.out.println("BEFORE logging block");
    for (int i = 0; i < 10; i++) {
      // This should take no time on the rendering thread :-)
      logger.info("Prints out after expensive condition");
    }
    System.out.println("AFTER logging block");
    System.out.println("Sleeping so that the JVM stays up");
    Thread.sleep(1001L * 10L);
  }
}

Please see the async example for more details.

Managing Thread Local State

Note that because logging is asynchronous, you must be very careful when accessing thread local state. Thread local state associated with logging, i.e. MDC / ThreadContext is automatically carried through, but in some cases you may need to do additional work.

For example, if you are using Spring Boot and are using RequestContextHolder.getRequestAttributes() when constructing context fields, you must call RequestContextHolder.setRequestAttributes(requestAttributes) so that the attributes are available to the thread.

You can add hooks into the logger to get and set thread local state appropriately using the withThreadLocal method, which returns a Supplier<Runnable>, where the Supplier is the "get" and the Runnable is the "set". For example, in the echopraxia-spring-boot-example, request attributes are thread local and must be carried across to the logging executor:

class GreetingController {
  private static final AsyncLogger<?> asyncLogger =
    AsyncLoggerFactory.getLogger()
      .withFieldBuilder(HttpRequestFieldBuilder.class)
      .withThreadLocal(
        () -> {
          // get the request attributes in rendering thread...
          final RequestAttributes requestAttributes =
            RequestContextHolder.currentRequestAttributes();
          // ...and the "set" in the runnable will be called in the logging executor's thread
          return () -> RequestContextHolder.setRequestAttributes(requestAttributes);
        })
      .withFields(fb -> fb.requestFields(httpServletRequest()));

  private HttpServletRequest httpServletRequest() {
    return ((ServletRequestAttributes) RequestContextHolder.getRequestAttributes())
      .getRequest();
  }
}

You can call withThreadLocal multiple times, and it will compose with earlier blocks.
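
For example, a minimal sketch composing the request attributes block above with a second, hypothetical application ThreadLocal named TENANT:

// hypothetical application state carried in a ThreadLocal
private static final ThreadLocal<String> TENANT = new ThreadLocal<>();

private static final AsyncLogger<?> composed =
    AsyncLoggerFactory.getLogger()
        .withThreadLocal(() -> {
          // "get" runs in the calling thread...
          RequestAttributes attrs = RequestContextHolder.currentRequestAttributes();
          // ...and the returned "set" runs in the logging executor's thread
          return () -> RequestContextHolder.setRequestAttributes(attrs);
        })
        .withThreadLocal(() -> {
          String tenant = TENANT.get();
          return () -> TENANT.set(tenant);
        });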

Managing Caller Info

Caller information -- the caller class, method, and line number -- is typically implemented in frameworks by processing a stack trace. This is a problem for asynchronous logging, because the caller information is in a different thread.

Echopraxia can fix this by capturing the caller information just before starting the async thread, but it must do so on request. Crucially, although the throwable is created eagerly, the processing of stack trace elements still happens on the executor thread, keeping the cost of caller information off the calling thread.

You must specifically configure the implementation to capture async caller data.

Log4J

For Log4J, you must set includeLocation=true on the logger you want to capture caller data, and configure the appender to render caller data:

<Configuration packages="...">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
        <!-- must have location info enabled to get caller data -->
        <JsonTemplateLayout 
          eventTemplateUri="classpath:JsonLayout.json" 
          locationInfoEnabled="true">
            
        </JsonTemplateLayout>
    </Console>
  </Appenders>
    
  <Loggers>
    <!-- Must set includeLocation=true specifically for async caller info -->
    <Root level="debug" includeLocation="true">
      <AppenderRef ref="Console"/>
    </Root>
</Loggers>
</Configuration>      

Logback

Logback is a little more complicated, because there is no direct way to get the logging event from a logger. Instead, a special caller marker is added, and an appender is used to extract caller data from the marker and set it on the event.

To enable it, you must set the context property echopraxia.async.caller to true, and wrap your appenders with com.tersesystems.echopraxia.logstash.CallerDataAppender:

<configuration>
    
    <property scope="context" name="echopraxia.async.caller" value="true"/>

    <!-- loosen the rule so appenders use appender-refs -->
    <newRule pattern="*/appender/appender-ref"
             actionClass="ch.qos.logback.core.joran.action.AppenderRefAction"/>

    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <!-- https://logback.qos.ch/manual/layouts.html#caller -->
        <encoder>
            <pattern>%date{H:mm:ss.SSS} %highlight(%-5level) [%thread]: %message%n%ex%caller{2}</pattern>
        </encoder>
    </appender>
    
    <appender name="JSON" class="net.logstash.logback.appender.LoggingEventAsyncDisruptorAppender">
      <appender class="ch.qos.logback.core.FileAppender">
        <file>application.log</file>
        <!-- Or https://github.com/logfellow/logstash-logback-encoder#caller-info-fields -->
        <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
          <providers>
            <!-- must add caller data to see method / line info in json -->
            <callerData/>
          </providers>
        </encoder>
      </appender>
    </appender>

    <root level="INFO">
        <!--
           CallerDataAppender sets the caller data on event from marker.
           Note this depends on setting the newRule above!
         -->
        <appender class="com.tersesystems.echopraxia.logstash.CallerDataAppender">
            <appender-ref ref="CONSOLE"/>
            <appender-ref ref="JSON"/>
        </appender>
    </root>
</configuration>

Dynamic Conditions with Scripts

One of the limitations of logging is that it's not that easy to change logging levels in an application at run-time. In modern applications, you typically have complex inputs and may want to enable logging for some very specific inputs without turning on your logging globally.

Script Conditions lets you tie your conditions to scripts that you can change and re-evaluate at runtime.

The security concerns surrounding Groovy or Javascript make them unsuitable in a logging environment. Fortunately, Echopraxia provides a Tweakflow script integration that lets you evaluate logging statements safely.

Tweakflow comes with a VS Code integration, a reference guide, and a standard library that contains useful regular expression and date manipulation logic.

Script Syntax

The call site for a script is the function evaluate inside a library called echopraxia. The level and context are passed through as (string level, dict ctx), where ctx is a dictionary of functions that connect back to the logging context.

Methods in the context are snake case, separated by underscores. For example, to call the equivalent of ctx.findString("$.person.name"), you would call ctx[:find_string]("$.person.name"). You can use the let construct in Tweakflow to make this clearer:

library echopraxia {
  function evaluate: (string level, dict ctx) ->
    let {
      find_string: ctx[:find_string];
    }
    find_string("$.person.name") == "testing";
}

You can get access to the raw fields of the logging context with ctx[:fields], although this will iterate over every field and is less efficient than using the find functions:

library echopraxia {
  function evaluate: (string level, dict ctx) ->
    let {
      fields: ctx[:fields]();
    }
    fields[:name] = "Will";
}

Using find_object or find_list returns the appropriate type of dict or list respectively.

library echopraxia {
  function evaluate: (string level, dict ctx) ->
    let {
      find_list: ctx[:find_list];
      interests: find_list("$.obj.interests");
    }
    interests[1] == "drink";
}

And you can use the Tweakflow standard library to allow for more advanced functionality, i.e.

import * as std from "std";
alias std.strings as str;
library echopraxia {
  function evaluate: (string level, dict ctx) ->
    let {
      find_string: ctx[:find_string];
    }
    str.lower_case(find_string("$.person.name")) == "will";
}

Creating Script Conditions

The simplest way to handle a script is to pass it in directly as a string:

import com.tersesystems.echopraxia.scripting.*;

String scriptString = String.join("\n",
  "library echopraxia {",
  "  function evaluate: (string level, dict ctx) ->",
  "    level == \"info\";",
  "}");
Condition c = ScriptCondition.create(false, scriptString, Throwable::printStackTrace);

You can also use a Path for file based scripts:

import com.tersesystems.echopraxia.scripting.*;

Path path = Paths.get("src/test/tweakflow/condition.tf");
Condition condition = ScriptCondition.create(false, path, Throwable::printStackTrace);

Logger<?> logger = LoggerFactory.getLogger(getClass()).withCondition(condition);

Where condition.tf contains a tweakflow script, e.g.

import * as std from "std";
alias std.strings as str;

library echopraxia {
  # level: the logging level
  # ctx: the logging context
  function evaluate: (string level, dict ctx) ->
    let {
      find_string: ctx[:find_string];
    }
    str.lower_case(find_string("$.person.name")) == "will";
}

You also have the option to store scripts in a key-value store or in a database. See the sqlite condition store example for details.

Watched Scripts

You can change file based scripts while the application is running, if they are in a directory watched by ScriptWatchService.

To configure ScriptWatchService, pass it the directory that contains your script files:

final Path watchedDir = Paths.get("/your/script/directory");
ScriptWatchService watchService = new ScriptWatchService(watchedDir);

Path filePath = watchedDir.resolve("myscript.tf");

Logger<?> logger = LoggerFactory.getLogger();

final ScriptHandle watchedHandle = watchService.watchScript(filePath, 
        e -> logger.error("Script compilation error", e));
final Condition condition = ScriptCondition.create(watchedHandle);

logger.info(condition, "Statement only logs if condition is met!");
        
// After that, you can edit myscript.tf and the condition will 
// re-evaluate the script as needed automatically!
        
// You can delete the file, but doing so will log a warning from `ScriptWatchService`
// Recreating a deleted file will trigger an evaluation, same as modification.

// Note that the watch service creates a daemon thread to watch the directory.
// To free up the thread and stop watching, you should call close() as appropriate:
watchService.close();

Please see the scripting example for more details.

Installation

Because Scripting has a dependency on Tweakflow, it is broken out into a distinct library that you must add to your build.

Maven:

<dependency>
  <groupId>com.tersesystems.echopraxia</groupId>
  <artifactId>scripting</artifactId>
  <version><VERSION></version>
</dependency>

Gradle:

implementation "com.tersesystems.echopraxia:scripting:<VERSION>" 

Semantic Logging

Semantic Loggers are strongly typed, and will only log a particular kind of argument. All the work of field building and setting up a message is done at setup time.

Basic Usage

To set up a logger for a Person with name and age properties, you would do the following:

import com.tersesystems.echopraxia.semantic.*;

SemanticLogger<Person> logger =
    SemanticLoggerFactory.getLogger(
        getClass(),
        Person.class,
        person -> "person.name = {}, person.age = {}",
        p -> fb -> fb.list(fb.string("name", p.name), fb.number("age", p.age)));

Person person = new Person("Eloise", 1);
logger.info(person);

Conditions

Semantic loggers take conditions in the same way that other loggers do, either through a predicate:

if (logger.isInfoEnabled(condition)) {
  logger.info(person);
}

or directly on the method:

logger.info(condition, person);

or on the logger:

logger.withCondition(condition).info(person);

Context

Semantic loggers can add fields to context in the same way other loggers do.

SemanticLogger<Person> loggerWithContext =
  logger.withFields(fb -> fb.string("some_context_field", contextValue));

Installation

Semantic Loggers have a dependency on the api module, but do not have any implementation dependencies.

Maven:

<dependency>
  <groupId>com.tersesystems.echopraxia</groupId>
  <artifactId>semantic</artifactId>
  <version><VERSION></version>
</dependency>

Gradle:

implementation "com.tersesystems.echopraxia:semantic:<VERSION>" 

Fluent Logging

Fluent logging is done using a FluentLoggerFactory.

It is useful in situations where arguments may need to be built up over time.

import com.tersesystems.echopraxia.fluent.*;

FluentLogger<?> logger = FluentLoggerFactory.getLogger(getClass());

Person person = new Person("Eloise", 1);

logger
    .atInfo()
    .message("name = {}, age = {}")
    .argument(fb -> fb.string("name", person.name))
    .argument(fb -> fb.number("age", person.age))
    .log();

Installation

Fluent Loggers have a dependency on the api module, but do not have any implementation dependencies.

Maven:

<dependency>
  <groupId>com.tersesystems.echopraxia</groupId>
  <artifactId>fluent</artifactId>
  <version><VERSION></version>
</dependency>

Gradle:

implementation "com.tersesystems.echopraxia:fluent:<VERSION>" 

Core Logger

Because Echopraxia provides its own implementation-independent API, some implementation features are not normally exposed. If you want to use implementation-specific features like markers, you will need to use a core logger.

Logstash API

First, import the logstash package. This gets you access to the CoreLoggerFactory and CoreLogger, which can be cast to LogstashCoreLogger:

import com.tersesystems.echopraxia.logstash.*;
import com.tersesystems.echopraxia.api.*;

LogstashCoreLogger core = (LogstashCoreLogger) CoreLoggerFactory.getLogger();

The LogstashCoreLogger has a withMarkers method that takes an SLF4J marker:

Logger<?> logger = LoggerFactory.getLogger(
      core.withMarkers(MarkerFactory.getMarker("SECURITY")), FieldBuilder.instance());

If you have markers set as context, you can evaluate them in a condition through casting to LogstashLoggingContext:

Condition hasAnyMarkers = (level, context) -> {
   LogstashLoggingContext c = (LogstashLoggingContext) context;
   List<org.slf4j.Marker> markers = c.getMarkers();
   return markers.size() > 0;
};

If you need to get at the SLF4J logger from a core logger, you can cast and call core.logger():

Logger<?> baseLogger = LoggerFactory.getLogger();
LogstashCoreLogger core = (LogstashCoreLogger) baseLogger.core();
org.slf4j.Logger slf4jLogger = core.logger();

Log4J

Similar to Logstash, you can get access to Log4J specific features by importing

import com.tersesystems.echopraxia.log4j.*;
import com.tersesystems.echopraxia.api.*;

Log4JCoreLogger core = (Log4JCoreLogger) CoreLoggerFactory.getLogger();

The Log4JCoreLogger has a withMarker method that takes a Log4J marker:

final Marker securityMarker = MarkerManager.getMarker("SECURITY");
Logger<?> logger = LoggerFactory.getLogger(
      core.withMarker(securityMarker), FieldBuilder.instance());

If you have a marker set as context, you can evaluate it in a condition through casting to Log4JLoggingContext:

Condition hasAnyMarkers = (level, context) -> {
   Log4JLoggingContext c = (Log4JLoggingContext) context;
   Marker m = c.getMarker();
   return securityMarker.equals(m);
};

If you need to get the Log4j logger from a core logger, you can cast and call core.logger():

Logger<?> baseLogger = LoggerFactory.getLogger();
Log4JCoreLogger core = (Log4JCoreLogger) baseLogger.core();
org.apache.logging.log4j.Logger log4jLogger = core.logger();

Filters

There are times when you want to add a field or a condition to all loggers. Although you can wrap individual loggers or create your own wrapper around LoggerFactory, this can be a labor-intensive process that requires lots of code modification, and must be handled for fluent, semantic, async, and regular loggers.

Echopraxia includes filters that wrap the CoreLogger returned by CoreLoggerFactory, providing the ability to modify core loggers from a single pipeline in the code.

For example, to add a uses_filter field to every Echopraxia logger:

package example;

import com.tersesystems.echopraxia.api.*;

public class ExampleFilter implements CoreLoggerFilter {
  @Override
  public CoreLogger apply(CoreLogger coreLogger) {
    return coreLogger
        .withFields(fb -> fb.bool("uses_filter", true), FieldBuilder.instance());
  }
}

Filters must implement the CoreLoggerFilter interface, and must have a no-args constructor.

Filters must be registered by fully qualified class name in an /echopraxia.properties file available as a resource on your classpath. The format is filter.N, where N is the order in which filters should be loaded.

filter.0=example.ExampleFilter

Filters are particularly helpful when you need to provide "out of context" information for your conditions.

For example, imagine that you have a situation in which the program uses more CPU or memory than normal in production, but works fine in a staging environment. Using OSHI and a filter, you can provide the machine statistics and evaluate with dynamic conditions.

public class SystemInfoFilter implements CoreLoggerFilter {

  private final SystemInfo systemInfo;

  public SystemInfoFilter() {
     systemInfo = new SystemInfo();
  }

  @Override
  public CoreLogger apply(CoreLogger coreLogger) {
    HardwareAbstractionLayer hardware = systemInfo.getHardware();
    GlobalMemory mem = hardware.getMemory();
    CentralProcessor proc = hardware.getProcessor();
    double[] loadAverage = proc.getSystemLoadAverage(3);

    // Now you can add conditions based on these fields, and conditionally
    // enable logging based on your load and memory!
    return coreLogger.withFields(fb -> {
        Field loadField = fb.object("load_average", //
                fb.number("1min", loadAverage[0]), //
                fb.number("5min", loadAverage[1]), //
                fb.number("15min", loadAverage[2]));
        Field memField = fb.object("mem", //
                fb.number("available", mem.getAvailable()), //
                fb.number("total", mem.getTotal()));
        Field sysinfoField = fb.object("sysinfo", loadField, memField);
        return sysinfoField;
      }, FieldBuilder.instance());
  }
}

Please see the system info example for details.

Download Details:
Author: tersesystems
Source Code: https://github.com/tersesystems/echopraxia
License: View license
