Webrtc: Pure Go Implementation Of The WebRTC API

 Pion WebRTC 
 

A pure Go implementation of the WebRTC API

Usage

Go Modules are mandatory for using Pion WebRTC, so make sure to set export GO111MODULE=on and explicitly specify /v2 or /v3 when importing.
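For illustration, a minimal program using the versioned /v3 import path might look like the following sketch (the STUN server URL is only a placeholder, not something the Pion docs prescribe):

package main

import "github.com/pion/webrtc/v3"

func main() {
    // Create a PeerConnection using the versioned /v3 module path.
    config := webrtc.Configuration{
        ICEServers: []webrtc.ICEServer{
            {URLs: []string{"stun:stun.l.google.com:19302"}},
        },
    }
    peerConnection, err := webrtc.NewPeerConnection(config)
    if err != nil {
        panic(err)
    }
    defer peerConnection.Close()
}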

example applications contains code samples of common things people build with Pion WebRTC.

example-webrtc-applications contains more full featured examples that use 3rd party libraries.

awesome-pion contains projects that have used Pion, and serve as real world examples of usage.

GoDoc is an auto generated API reference. All our Public APIs are commented.

FAQ has answers to common questions. If you have a question that isn't covered, please ask in Slack; we are always looking to expand it.

Now go build something awesome! Here are some ideas to get your creative juices flowing:

  • Send a video file to multiple browsers in real time for perfectly synchronized movie watching.
  • Send a webcam on an embedded device to your browser with no additional server required!
  • Securely send data between two servers, without using pub/sub.
  • Record your webcam and do special effects server side.
  • Build a conferencing application that processes audio/video and makes decisions based on it.
  • Remotely control a robot and stream its cameras in real time.

Want to learn more about WebRTC?

Join our Office Hours. Come hang out, ask questions, get help debugging and hear about the cool things being built with WebRTC. We also start every meeting with basic project planning.

Check out WebRTC for the Curious. A book about WebRTC in depth, not just about the APIs. Learn the full details of ICE, SCTP, DTLS, SRTP, and how they work together to make up the WebRTC stack.

This is also a great resource if you are trying to debug. Learn the tools of the trade and how to approach WebRTC issues.

This book is vendor agnostic and will not have any Pion specific information.

Features

PeerConnection API

Connectivity

  • Full ICE Agent
  • ICE Restart
  • Trickle ICE
  • STUN
  • TURN (UDP, TCP, DTLS and TLS)
  • mDNS candidates

DataChannels

  • Ordered/Unordered
  • Lossy/Lossless (see the sketch below)
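As a rough sketch of how the ordered/unordered and lossy/lossless choices above map onto the v3 API (treat the channel label and values as illustrative only):

package main

import "github.com/pion/webrtc/v3"

func main() {
    peerConnection, err := webrtc.NewPeerConnection(webrtc.Configuration{})
    if err != nil {
        panic(err)
    }
    defer peerConnection.Close()

    // Unordered, lossy channel: no ordering guarantee and zero retransmissions.
    ordered := false
    maxRetransmits := uint16(0)
    dataChannel, err := peerConnection.CreateDataChannel("telemetry", &webrtc.DataChannelInit{
        Ordered:        &ordered,
        MaxRetransmits: &maxRetransmits,
    })
    if err != nil {
        panic(err)
    }

    dataChannel.OnOpen(func() {
        // Fire-and-forget message; delivery is not guaranteed on a lossy channel.
        _ = dataChannel.SendText("hello")
    })
}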

Media

Security

  • TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 and TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA for DTLS v1.2
  • SRTP_AEAD_AES_256_GCM and SRTP_AES128_CM_HMAC_SHA1_80 for SRTP
  • Hardware acceleration available for GCM suites

Pure Go

  • No Cgo usage
  • Wide platform support
    • Windows, macOS, Linux, FreeBSD
    • iOS, Android
    • WASM see examples
    • 386, amd64, arm, mips, ppc64
  • Easy to build. Numbers generated on an Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz:
    • Time to build examples/play-from-disk - 0.66s user 0.20s system 306% cpu 0.279 total
    • Time to run entire test suite - 25.60s user 9.40s system 45% cpu 1:16.69 total
  • Tools to measure performance provided

Roadmap

The library is in active development; please refer to the roadmap to track our major milestones. We also maintain a list of Big Ideas: things we want to build but don't yet have a clear plan or the resources for. If you are looking to get involved, this is a great place to get started! We would also love to hear your ideas! Even if you can't implement it yourself, it could inspire others.

Community

Pion has an active community on Slack.

Follow the Pion Twitter for project updates and important WebRTC news.

We are always looking to support your projects. Please reach out if you have something to build! If you need commercial support or don't want to use public methods, you can contact us at team@pion.ly.

Contributing

Check out the contributing wiki to join the group of amazing people making this project possible:

Author: Pion
Source Code: https://github.com/pion/webrtc 
License: MIT License

#go #golang #audio 


Google Oboe Audio Engine Plugins for Flutter

flutter_oboe

Oboe audio engine plugin for android!

Getting Started

This project is a basic sample of how to use the Oboe audio engine from Google inside a Flutter project.

Note: Oboe is not available for iOS yet, so you need to handle things separately if you're building for iOS.

Configuration requirements for Android: if you're using the path_provider package, make sure your target and compile SDK for Android is 31. Note: make sure you add the correct permissions or it will not record sound.

1. Set up the NDK path to build the native C/C++ code with the Android NDK.

2. Make sure that you have the oboe and libsndfile directories inside the "C:\flutter.pub-cache\hosted\pub.dartlang.org" folder, or wherever your Flutter installation directory is.

Use this package as a library

Depend on it

Run this command:

With Flutter:

 $ flutter pub add flutter_oboe

This will add a line like this to your package's pubspec.yaml (and run an implicit flutter pub get):

dependencies:
  flutter_oboe: ^0.0.3

Alternatively, your editor might support flutter pub get. Check the docs for your editor to learn more.

Import it

Now in your Dart code, you can use:

import 'package:flutter_oboe/flutter_oboe.dart';

example/lib/main.dart

import 'dart:io';
import 'dart:math';
import 'dart:typed_data';

import 'package:flutter/material.dart';
import 'package:get/get.dart';
import 'dart:async';
import 'package:flutter/foundation.dart' show kIsWeb;

import 'package:flutter/services.dart';
import 'package:flutter_oboe/flutter_oboe.dart';
import 'package:flutter_oboe/flutter_oboe_test.dart';
import 'package:external_path/external_path.dart';
import 'package:path_provider/path_provider.dart';
import 'package:just_audio/just_audio.dart';

import 'package:permission_handler/permission_handler.dart';

void main() {
  runApp(const MyApp());
}

class MyApp extends StatefulWidget {
  const MyApp({Key? key}) : super(key: key);

  @override
  State<MyApp> createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  String _platformVersion = 'Unknown';
  final stream = OboeTestStream();
  final flutterOboe = FlutterOboe();
  final noise = Float32List(512);
  late Timer timer;
  bool bFileExist = false, bRecording = false, bRecordingPaused = false;
  final _player = AudioPlayer();
  late String filePath;

  Future<String> _getPath() async {
    Directory appDocDir = await getApplicationDocumentsDirectory();
    return appDocDir.path;
  }

  @override
  void initState() {
    super.initState();
    initPlatformState();

    //Oboe Test
    for (var i = 0; i < noise.length; i++) {
      noise[i] = sin(8 * pi * i / noise.length);
    }
  }

  @override
  void dispose() {
    stream.dispose();
    flutterOboe.dispose();
    super.dispose();
  }

  // Platform messages are asynchronous, so we initialize in an async method.
  Future<void> initPlatformState() async {
    String platformVersion;
    // Platform messages may fail, so we use a try/catch PlatformException.
    // We also handle the message potentially returning null.
    try {
      platformVersion = await FlutterOboeTest.platformVersion ?? 'Unknown platform version';
    } on PlatformException {
      platformVersion = 'Failed to get platform version.';
    }

    if (!kIsWeb) {
      var status = await Permission.microphone.request();
      if (status != PermissionStatus.granted) {
        print('Microphone permission is needed');
      }
    }

    //Get the recording path
    filePath = await _getPath() + '/temp_recording.wav';

    // If the widget was removed from the tree while the asynchronous platform
    // message was in flight, we want to discard the reply rather than calling
    // setState to update our non-existent appearance.
    if (!mounted) return;

    setState(() {
      _platformVersion = platformVersion;
    });
  }

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(
          title: const Text('Oboe Plugin'),
        ),
        body: SizedBox(
          width: double.infinity,
          child: Column(
            mainAxisAlignment: MainAxisAlignment.center,
            crossAxisAlignment: CrossAxisAlignment.center,
            children: [
              Text('Running on: $_platformVersion').paddingOnly(bottom: 5),
              Text('Sample rate : ${stream.getSampleRate()}').paddingOnly(bottom: 5),
              ElevatedButton(
                      onPressed: startRecording,
                      style: ElevatedButton.styleFrom(
                        primary: bRecording ? Colors.blueGrey : Colors.green,
                        shadowColor: Colors.black54,
                      ),
                      child: const Text('Start recording'))
                  .paddingOnly(top: 10),

              ElevatedButton(
                  onPressed: pauseRecording,
                  style: ElevatedButton.styleFrom(
                    primary: Colors.blue,
                    shadowColor: Colors.black54,
                  ),
                  child: const Text('Pause recording'))
                  .paddingOnly(top: 10),

                ElevatedButton(
                  onPressed: resumeRecording,
                  style: ElevatedButton.styleFrom(
                    primary: Colors.green,
                    shadowColor: Colors.black54,
                  ),
                  child: const Text('Resume recording'))
                  .paddingOnly(top: 10),

              ElevatedButton(
                      onPressed: stopRecording,
                      style: ElevatedButton.styleFrom(
                        primary: bRecording ? Colors.red : Colors.blueGrey,
                        shadowColor: Colors.black54,
                      ),
                      child: const Text('Stop recording'))
                  .paddingOnly(bottom: 5),
              FutureBuilder(
                future: _getPath(),
                builder: (BuildContext context, AsyncSnapshot snapshot) {
                  if (snapshot.hasData) {
                    return Text('Recording will be saved  to: \n' + snapshot.data).paddingSymmetric(vertical: 10);
                  } else {
                    return Text("Loading").paddingSymmetric(vertical: 10);
                  }
                },
              ),
              Text(
                bFileExist ? 'Recording exist' : 'Recording not found',
                style: TextStyle(color: bFileExist ? Colors.green : Colors.red),
              ).paddingOnly(bottom: 5, top: 10),
              if (bFileExist)
                ElevatedButton(
                        onPressed: playAudio,
                        style: ElevatedButton.styleFrom(
                          primary: bRecording ? Colors.blueGrey : Colors.blue,
                          shadowColor: Colors.black54,
                        ),
                        child: const Text('Play audio'))
                    .paddingOnly(bottom: 5),

              if (bFileExist)
                ElevatedButton(
                    onPressed: pauseAudio,
                    style: ElevatedButton.styleFrom(
                      primary: bRecording ? Colors.blueGrey : Colors.blue,
                      shadowColor: Colors.black54,
                    ),
                    child: const Text('Pause audio'))
                    .paddingOnly(bottom: 5),

              if (bFileExist)
                ElevatedButton(
                    onPressed: resumeAudio,
                    style: ElevatedButton.styleFrom(
                      primary: bRecording ? Colors.blueGrey : Colors.blue,
                      shadowColor: Colors.black54,
                    ),
                    child: const Text('Resume audio'))
                    .paddingOnly(bottom: 5),

              if (bFileExist)
                ElevatedButton(
                    onPressed: stopAudio,
                    style: ElevatedButton.styleFrom(
                      primary: bRecording ? Colors.blueGrey : Colors.amber,
                      shadowColor: Colors.black54,
                    ),
                    child: const Text('Stop audio'))
                    .paddingOnly(bottom: 5),
            ],
          ),
        ),
      ),
    );
  }

  //Audio Recorder
  void startRecording() {
    if (!bRecording) {
      setState(() {
        flutterOboe.startRecording();
        bRecording = true;
      });
    }
  }

  void pauseRecording() {
    if (bRecording) {
      setState(() {
        flutterOboe.pauseRecording();
        bRecordingPaused = true;
      });
    }
  }

  void resumeRecording() {
    if (bRecordingPaused) {
      setState(() {
        flutterOboe.resumeRecording();
        bRecordingPaused = false;
      });
    }
  }

  void stopRecording() async {
    //Can stop the recording only when it's still recording
    if (bRecording) {
      setState(() {
        flutterOboe.stopRecording();
        flutterOboe.writeToFile(filePath);
        bRecording = false;
      });

      await checkFile();
    }
  }

  void saveRecording() {
    flutterOboe.writeToFile(filePath);
  }

  Future<void> checkFile() async {
    bFileExist = await File(filePath).exists();
    setState(() {
      if (bFileExist) {
        playAudio();
      }
    });
  }

  void playAudio() async {
    //Can play only when it's not recording
    if (!bRecording) {
      flutterOboe.playFromFile(filePath);
    }
  }

  void pauseAudio() async {
    //Can play only when it's not recording
    if (!bRecording) {
      flutterOboe.pausePlayingFromFile();
    }
  }

  void resumeAudio() async {
    //Can play only when it's not recording
    if (!bRecording) {
      flutterOboe.resumePlayingFromFile();
    }
  }

  void stopAudio() async {
    flutterOboe.stopPlayingFromFile();
  }
} 

Download Details:

Author: joydash

Source Code: https://bitbucket.org/joydash/flutter_oboe/src/master/

#flutter #audio #android 


TS-audio: An Agnostic & Easy-to-use Library to Work with the AudioContext API

ts-audio

ts-audio is an agnostic and easy-to-use library to work with the AudioContext API and create Playlists.

Features

  • Simple API that abstracts the complexity of the AudioContext API
  • Cross-browser support
  • Makes it easy to create audio playlists
  • Works with any language that compiles to JavaScript
  • Type support included
  • Zero dependencies

Installation

To install ts-audio, execute:

$ npm install ts-audio

or

$ yarn add ts-audio

Quickstart

ts-audio has two components at its core: Audio and AudioPlaylist. Both components are functions that you can call with certain parameters.

Below is an example of how to use the Audio:

import Audio from 'ts-audio';

const audio = Audio({
  file: './song.mp3',
  loop: true,
  volume: 0.2,
});

audio.play();

To use the AudioPlaylist component is also quite simple:

import { AudioPlaylist } from 'ts-audio';

const playlist = AudioPlaylist({
  files: ['./songOne.mp3', './songTwo.mp3', './songThree.mp3'],
  volume: 0.7,
});

playlist.play();

Docs

Audio

Author: EvandroLG
Source Code: https://github.com/EvandroLG/ts-audio 
License: MIT License

#typescript #api #audio 


Flutter Plugin That Can Support Audio Recording & Level Metering

flutter_audio_recorder 


Flutter audio recording plugin that supports record, pause, resume, and stop, and provides access to audio level metering properties (average power, peak power).

Works for both Android and iOS

Code Samples:

Installation

add flutter_audio_recorder to your pubspec.yaml

iOS Permission

  1. Add usage description to plist
<key>NSMicrophoneUsageDescription</key>
<string>Can We Use Your Microphone Please</string>
  2. Then use hasPermission api to ask user for permission when needed

Android Permission

  1. Add uses-permission to ./android/app/src/main/AndroidManifest.xml in xml root level like below
 ...
 </application>
 <uses-permission android:name="android.permission.RECORD_AUDIO"/>
 <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
 ...
</manifest>
  2. Then use hasPermission api to ask user for permission when needed

Configuration

iOS deployment target is 8.0 or above.

Android

  • AndroidX: use latest version (0.5.x)
  • Legacy Android: use old version (0.4.9)

Usage

Recommended API usage: hasPermission -> init -> start -> (pause <-> resume) * n -> stop; call init again before starting another recording.

Always check permission first (it will request permission if it has not been granted or denied yet; otherwise it returns the current recording permission result).

bool hasPermission = await FlutterAudioRecorder.hasPermissions;

Initialize (run this before start, so we can check whether a file with the given name already exists)

var recorder = FlutterAudioRecorder("file_path.mp4"); // .wav .aac .m4a
await recorder.initialized;

or

var recorder = FlutterAudioRecorder("file_path", audioFormat: AudioFormat.AAC); // or AudioFormat.WAV
await recorder.initialized;

Sample Rate

var recorder = FlutterAudioRecorder("file_path", audioFormat: AudioFormat.AAC, sampleRate: 22000); // sampleRate is 16000 by default
await recorder.initialized;

Audio Extension and Format Mapping

Audio Format | Audio Extension List
AAC          | .m4a .aac .mp4
WAV          | .wav

Start recording

await recorder.start();
var recording = await recorder.current(channel: 0);

Get recording details

var current = await recording.current(channel: 0);
// print(current.status);

You could use a timer to access details every 50 ms (simply cancel the timer when recording is done)

new Timer.periodic(tick, (Timer t) async {
  var current = await recording.current(channel: 0);
  // print(current.status);
  setState(() {});
});

Recording

Name        | Description
path        | String
extension   | String
duration    | Duration
audioFormat | AudioFormat
metering    | AudioMetering
status      | RecordingStatus

Recording.metering

Name              | Description
peakPower         | double
averagePower      | double
isMeteringEnabled | bool

Recording.status

Unset, Initialized, Recording, Paused, Stopped

Pause

await recorder.pause();

Resume

await recorder.resume();

Stop (after stop, run init again to create another recording)

var result = await recorder.stop();
File file = widget.localFileSystem.file(result.path);

Example

Please check example app using Xcode.

Getting Started

This project is a starting point for a Flutter plug-in package, a specialized package that includes platform-specific implementation code for Android and/or iOS.

For help getting started with Flutter, view our online documentation, which offers tutorials, samples, guidance on mobile development, and a full API reference.

Author: RMbrone
Source Code: https://github.com/rmbrone/flutter_audio_recorder 
License: MIT License

#flutter #dart #audio 


Malgo: Mini Audio Library

malgo

Go bindings for miniaudio library.

Requires cgo, but does not require linking to anything on Windows/macOS; it links only -ldl on Linux/BSDs.

Installation

go get -u github.com/gen2brain/malgo

Documentation

Documentation on GoDoc. Also check examples.
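To get a quick feel for the API, a small sketch that initializes a context and lists the playback devices might look like the following (modeled on the package's examples; the Name() accessor on DeviceInfo is an assumption, so check the GoDoc):

package main

import (
    "fmt"

    "github.com/gen2brain/malgo"
)

func main() {
    // Initialize a miniaudio context with the default backends and a simple log callback.
    ctx, err := malgo.InitContext(nil, malgo.ContextConfig{}, func(message string) {
        fmt.Printf("LOG: %v", message)
    })
    if err != nil {
        panic(err)
    }
    defer func() {
        _ = ctx.Uninit()
        ctx.Free()
    }()

    // Enumerate the available playback devices.
    infos, err := ctx.Devices(malgo.Playback)
    if err != nil {
        panic(err)
    }
    for i, info := range infos {
        fmt.Printf("%d: %s\n", i, info.Name()) // Name() assumed; see the package docs
    }
}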

Platforms

  • Windows (WASAPI, DirectSound, WinMM)
  • Linux (PulseAudio, ALSA, JACK)
  • FreeBSD/NetBSD/OpenBSD (OSS/audio(4)/sndio)
  • macOS (CoreAudio)
  • Android (OpenSL|ES, AAudio)

Author: Gen2brain
Source Code: https://github.com/gen2brain/malgo 
License: Unlicense License

#golang #go #audio 


Gosamplerate: Libsamplerate Bindings for Go

libsamplerate binding for Golang

This is a Golang binding for libsamplerate (written in C), probably the best audio sample rate converter available today.

A classical use case is converting audio from a CD sample rate of 44.1kHz to the 48kHz sample rate used by DAT players.

libsamplerate is capable of arbitrary and time-varying conversions (maximum down-/upsampling by a factor of 256) and comes with 5 converters, allowing quality to be traded off against computation cost.
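For instance, converting mono audio from 44.1 kHz to 48 kHz uses a ratio of 48000/44100 ≈ 1.088. A rough sketch using a one-shot conversion call is shown below; the Simple function and its argument order are assumptions modeled on libsamplerate's src_simple, so verify the exact signature against the gosamplerate GoDoc:

package main

import (
    "fmt"

    "github.com/dh1tw/gosamplerate"
)

func main() {
    input := make([]float32, 44100) // one second of mono audio at 44.1 kHz
    ratio := 48000.0 / 44100.0      // upsampling ratio to reach 48 kHz

    // One-shot conversion; call name and argument order are assumed, not verified.
    output, err := gosamplerate.Simple(input, ratio, 1, gosamplerate.SRC_SINC_FASTEST)
    if err != nil {
        panic(err)
    }
    fmt.Println("output samples:", len(output))
}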

API implementations

gosamplerate implements the following libsamplerate API calls:

not (yet) implemented is:

License

This library (gosamplerate) is published under the permissive BSD license. You can find a good comparison of open source software licenses, including the BSD license, at choosealicense.com.

libsamplerate has been republished in 2016 under the 2-clause BSD license.

How to install samplerate

Make sure that you have libsamplerate installed on your system.

On macOS or Linux it can be installed conveniently through your package manager.

Linux:

using apt (Ubuntu), yum (CentOS), etc.

    $ sudo apt install libsamplerate0

MacOS

using Homebrew:

    $ brew install libsamplerate

Install gosamplerate

    $ go get github.com/dh1tw/gosamplerate

Documentation

The API of gosamplerate can be found at godoc.org. The documentation of libsamplerate (necessary in order to fully understand the API) can be found here.

Tests & Examples

The test coverage is close to 100%. The tests contain various examples on how to use gosamplerate.

Author: DH1tw
Source Code: https://github.com/dh1tw/gosamplerate 
License: BSD-2-Clause License

#golang #go #audio 


GoAudio: Native Go Audio Processing Library

GoAudio 🎶

GoAudio is an audio processing library, currently supporting WAVE files, although some tools such as the synth and breakpoints are encoding-agnostic, so you could combine them with a different library for storing the data and using GoAudio only as a means to generate the waveforms.
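As a rough idea of what reading a WAVE file with the wave package could look like (a sketch only; ReadWaveFile and the Frames field are assumed names based on the project's blog posts, not verified against the current API):

package main

import (
    "fmt"

    wave "github.com/DylanMeeus/GoAudio/wave"
)

func main() {
    // Read a WAVE file into memory; ReadWaveFile and Frames are assumed identifiers.
    w, err := wave.ReadWaveFile("input.wav")
    if err != nil {
        panic(err)
    }
    fmt.Println("frames read:", len(w.Frames))
}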

Features

Blog

If you want to know more about how this code works and what you can do with it, I write about this code and other audio related programs over on my blog: dylanmeeus.github.io.

Author: DylanMeeus
Source Code: https://github.com/DylanMeeus/GoAudio 
License: MIT License

#go #golang #audio 


Flac: Native Go FLAC Encoder/decoder with Support for FLAC Streams

flac

This package provides access to FLAC (Free Lossless Audio Codec) streams.
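A small decoding sketch using the package might look like the following (based on the Open/ParseNext pattern from the package documentation; treat the details as illustrative rather than authoritative):

package main

import (
    "fmt"
    "io"

    "github.com/mewkiz/flac"
)

func main() {
    // Open a FLAC stream and walk it frame by frame.
    stream, err := flac.Open("song.flac")
    if err != nil {
        panic(err)
    }
    defer stream.Close()

    frames := 0
    for {
        _, err := stream.ParseNext()
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }
        frames++
    }
    fmt.Println("decoded audio frames:", frames)
}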

Documentation

Documentation provided by GoDoc.

  • flac: provides access to FLAC (Free Lossless Audio Codec) streams.
    • frame: implements access to FLAC audio frames.
    • meta: implements access to FLAC metadata blocks.

Changes

Version 1.0.7 (2021-01-28)

Version 1.0.6 (2019-12-20)

  • Add experimental Encoder API to encode audio samples and metadata blocks (see #32).
  • Use go.mod.
  • Skip ID3v2 data prepended to flac files when parsing (see 36cc17e).
  • Add 16kHz test case. Thanks to Chewxy.
  • Fix lint issues (see #25).

Version 1.0.5 (2016-05-06)

  • Simplify import paths. Drop use of gopkg.in, and rely on vendoring instead (see azul3d/engine#1).
  • Add FLAC decoding benchmark (see d675e0a)

Version 1.0.4 (2016-02-11)

  • Add API examples to documentation (see #11).
  • Extend test cases (see aadf80a).

Version 1.0.3 (2016-02-02)

  • Implement decoding of FLAC files with wasted bits-per-sample (see #12).
  • Stress test the library using go-fuzz (see #10). Thanks to Patrick Mézard.

Version 1.0.2 (2015-06-05)

Version 1.0.1 (2015-02-25)

  • Fix two subframe decoding bugs (see #7). Thanks to Jonathan MacMillan.
  • Add frame decoding test cases.

Version 1.0.0 (2014-09-30)

  • Initial release.
  • Implement decoding of FLAC files.

Author: Mewkiz
Source Code: https://github.com/mewkiz/flac 
License: Unlicense License

#golang #audio 


A Flutter Plugin That Delivers Audio Buffers for Real-time Processing

flutter-voice-processor

A Flutter plugin for real-time voice processing.

Usage

Create:

int frameLength = 512;
int sampleRate = 16000;
VoiceProcessor _voiceProcessor = VoiceProcessor.getVoiceProcessor(frameLength, sampleRate);
Function _removeListener = _voiceProcessor.addListener((buffer) {
    print("Listener received buffer of size ${buffer.length}!");
});

Start audio:

try {
    if (await _voiceProcessor.hasRecordAudioPermission()) {
        await _voiceProcessor.start();    
    } else {
        print("Recording permission not granted");
    }
} on PlatformException catch (ex) {
    print("Failed to start recorder: " + ex.toString());
}

Stop audio:

await _voiceProcessor.stop();
_removeListener();

Use this package as a library

Depend on it

Run this command:

With Flutter:

 $ flutter pub add flutter_voice_processor

This will add a line like this to your package's pubspec.yaml (and run an implicit flutter pub get):

dependencies:
  flutter_voice_processor: ^1.0.6

Alternatively, your editor might support flutter pub get. Check the docs for your editor to learn more.

Import it

Now in your Dart code, you can use:

import 'package:flutter_voice_processor/flutter_voice_processor.dart'; 

example/lib/main.dart

//
// Copyright 2020-2021 Picovoice Inc.
//
// You may not use this file except in compliance with the license. A copy of the license is located in the "LICENSE"
// file accompanying this source.
//
// Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
// an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
// specific language governing permissions and limitations under the License.
//

import 'package:flutter/material.dart';
import 'dart:async';
import 'package:flutter/services.dart';
import 'package:flutter_voice_processor/flutter_voice_processor.dart';

void main() {
  runApp(MyApp());
}

class MyApp extends StatefulWidget {
  @override
  _MyAppState createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  bool _isButtonDisabled = false;
  bool _isProcessing = false;
  VoiceProcessor? _voiceProcessor;
  Function? _removeListener;
  Function? _removeListener2;
  Function? _errorListener;

  @override
  void initState() {
    super.initState();
    _initVoiceProcessor();
  }

  void _initVoiceProcessor() async {
    _voiceProcessor = VoiceProcessor.getVoiceProcessor(512, 16000);
  }

  Future<void> _startProcessing() async {
    this.setState(() {
      _isButtonDisabled = true;
    });

    _removeListener = _voiceProcessor?.addListener(_onBufferReceived);
    _removeListener2 = _voiceProcessor?.addListener(_onBufferReceived2);
    _errorListener = _voiceProcessor?.addErrorListener(_onErrorReceived);
    try {
      if (await _voiceProcessor?.hasRecordAudioPermission() ?? false) {
        await _voiceProcessor?.start();
        this.setState(() {
          _isProcessing = true;
        });
      } else {
        print("Recording permission not granted");
      }
    } on PlatformException catch (ex) {
      print("Failed to start recorder: " + ex.toString());
    } finally {
      this.setState(() {
        _isButtonDisabled = false;
      });
    }
  }

  void _onBufferReceived(dynamic eventData) {
    print("Listener 1 received buffer of size ${eventData.length}!");
  }

  void _onBufferReceived2(dynamic eventData) {
    print("Listener 2 received buffer of size ${eventData.length}!");
  }

  void _onErrorReceived(dynamic eventData) {
    String errorMsg = eventData as String;
    print(errorMsg);
  }

  Future<void> _stopProcessing() async {
    this.setState(() {
      _isButtonDisabled = true;
    });

    await _voiceProcessor?.stop();
    _removeListener?.call();
    _removeListener2?.call();
    _errorListener?.call();

    this.setState(() {
      _isButtonDisabled = false;
      _isProcessing = false;
    });
  }

  void _toggleProcessing() async {
    if (_isProcessing) {
      await _stopProcessing();
    } else {
      await _startProcessing();
    }
  }

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(
          title: const Text('Voice Processor'),
        ),
        body: Center(
          child: _buildToggleProcessingButton(),
        ),
      ),
    );
  }

  Widget _buildToggleProcessingButton() {
    return new ElevatedButton(
      onPressed: _isButtonDisabled ? null : _toggleProcessing,
      child: Text(_isProcessing ? "Stop" : "Start",
          style: TextStyle(fontSize: 20)),
    );
  }
} 

Download Details:

Author: 

Source Code: https://pub.dev/packages/flutter_voice_processor

#flutter #audio 


Flutter Radio Plugin Handles A Single Streaming Audio Precisely

Flutter radio plugin that handles a single audio stream precisely. This plugin was developed with maximum usage in mind. Flutter Radio Player enables streaming audio content on both Android and iOS natively; as an added feature, this plugin supports background playback as well. It also integrates deeply with core media capabilities such as MediaSession on Android and remote control capabilities (Control Center) on iOS. The plugin also supports controlling the player from both Wear OS and watchOS.

Features

  • Supports both android and ios
  • Supports background music playing
  • Integrates well with watchOS and WearOS.
  • Handles network interruptions.
  • Reactive
  • Developer friendly (logs are placed throughout the codebase, so it's easy to trace a bug)

Reactivity?

Unlike many other music-playing plugins, Flutter Radio Player is very reactive. It communicates with the native layer using events and streams, making the plugin reactive on both the application (Flutter) side and the native side.

Plugin events

This plugin utilises the Android LocalBroadcaster and the iOS Notification Center for pushing out events. The names of the events are listed below.

  • flutter_radio_playing
  • flutter_radio_paused
  • flutter_radio_stopped
  • flutter_radio_error
  • flutter_radio_loading

Getting Started

  1. Add this to your package's pubspec.yaml file
dependencies:
  flutter_radio_player: ^1.X.X
  2. Install it
$ flutter pub get
  3. Import it
import 'package:flutter_radio_player/flutter_radio_player.dart';
  4. Configure it. Create a new instance of the player. A FlutterRadioPlayer instance can play a single audio stream at a time. To create it, simply call the constructor. However, DO NOT make multiple instances of the service, as FRP uses a FOREGROUND SERVICE to keep itself alive when you minimize the application on Android.
FlutterRadioPlayer _flutterRadioPlayer = new FlutterRadioPlayer();

When you have an FRP instance, you may simply call the init method to invoke the platform-specific player preparation. For the API, please refer to the FRP API.

await _flutterRadioPlayer.init("Flutter Radio Example", "Live", "URL_HERE", "true");

After player preparation you may simply call playOrPause method to toggle audio stream.

await _flutterRadioPlayer.playOrPause();

FRP does allow you to change the URL after the player is initialized. You can simply change the stream URL by calling setUrl on the FRP object.

await _flutterRadioPlayer.setUrl('URL_HERE', "false");

Calling the above method will pause the existing stream and play the newly set URL. Please refer to the FRP API for API documentation.

Besides the above-mentioned methods, below are the other methods that FRP exposes.

  • stop() - Will stop all the streaming audio streams and detaches itself from the FOREGROUND SERVICE. You need to reinitialize to use the plugin again.
await _flutterRadioPlayer.stop()
  • start() - Will start the audio stream using the initialized object.
await _flutterRadioPlayer.start()
  • pause() - Will pause the audio stream using the initialized object.
await _flutterRadioPlayer.pause()

And that's not all. This plugin handles almost everything for you when it comes to playing a single stream of audio. From player meta details to network interruptions, FRP handles it all without breaking a sweat. Please refer to the example to get an idea of what FRP can do.

iOS and Android Support

If the plugin is failing to initialize, kindly make sure the permissions for background processes are granted for your application.

For your Android application you might want to add permissions in AndroidManifest.xml. This is already added at the library level.

    <!--  Permissions for the plugin  -->
    <uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
    <uses-permission android:name="android.permission.INTERNET" />

    <!--  Services for the plugin  -->
    <application android:usesCleartextTraffic="true">
        <service android:name=".core.StreamingCore"/>
    </application>

For your iOS application you need to enable it like this

xcode image

Support

Please give the plugin a like on pub if you use it and love it. Put a ⭐️ on my GitHub repo and show me some ♥️ so I can keep working on this.

Found a bug ?

Please feel free to throw in a pull request. Any support is warmly welcome.

Use this package as a library

Depend on it

Run this command:

With Flutter:

 $ flutter pub add flutter_radio_player

This will add a line like this to your package's pubspec.yaml (and run an implicit flutter pub get):

dependencies:
  flutter_radio_player: ^1.1.0

Alternatively, your editor might support flutter pub get. Check the docs for your editor to learn more.

Import it

Now in your Dart code, you can use:

import 'package:flutter_radio_player/flutter_radio_player.dart'; 

example/lib/main.dart

import 'dart:async';

import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:flutter_radio_player/flutter_radio_player.dart';

void main() => runApp(MyApp());

class MyApp extends StatefulWidget {
  final playerState = FlutterRadioPlayer.flutter_radio_paused;

  @override
  _MyAppState createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  int _currentIndex = 0;
  double volume = 0.8;
  FlutterRadioPlayer _flutterRadioPlayer = new FlutterRadioPlayer();

  @override
  void initState() {
    super.initState();
    initRadioService();
  }

  Future<void> initRadioService() async {
    try {
      await _flutterRadioPlayer.init(
        "Flutter Radio Example",
        "Live",
        "http://209.133.216.3:7018/stream?type=http&nocache=1906",
        "false",
      );
    } on PlatformException {
      print("Exception occurred while trying to register the services.");
    }
  }

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(
          title: const Text('Flutter Radio Player Example'),
        ),
        body: Center(
          child: Column(
            children: <Widget>[
              StreamBuilder(
                stream: _flutterRadioPlayer.isPlayingStream,
                initialData: widget.playerState,
                builder:
                    (BuildContext context, AsyncSnapshot<String> snapshot) {
                  String returnData = snapshot.data;
                  print("object data: " + returnData);
                  switch (returnData) {
                    case FlutterRadioPlayer.flutter_radio_stopped:
                      return ElevatedButton(
                        style: ElevatedButton.styleFrom(),
                        child: Text("Start listening now"),
                        onPressed: () async {
                          await initRadioService();
                        },
                      );
                      break;
                    case FlutterRadioPlayer.flutter_radio_loading:
                      return Text("Loading stream...");
                    case FlutterRadioPlayer.flutter_radio_error:
                      return ElevatedButton(
                        style: ElevatedButton.styleFrom(),
                        child: Text("Retry ?"),
                        onPressed: () async {
                          await initRadioService();
                        },
                      );
                      break;
                    default:
                      return Row(
                        crossAxisAlignment: CrossAxisAlignment.center,
                        mainAxisAlignment: MainAxisAlignment.center,
                        children: <Widget>[
                          IconButton(
                            onPressed: () async {
                              print("button press data: " +
                                  snapshot.data.toString());
                              await _flutterRadioPlayer.playOrPause();
                            },
                            icon: snapshot.data ==
                                    FlutterRadioPlayer.flutter_radio_playing
                                ? Icon(Icons.pause)
                                : Icon(Icons.play_arrow),
                          ),
                          IconButton(
                            onPressed: () async {
                              await _flutterRadioPlayer.stop();
                            },
                            icon: Icon(Icons.stop),
                          )
                        ],
                      );
                      break;
                  }
                },
              ),
              Slider(
                value: volume,
                min: 0,
                max: 1.0,
                onChanged: (value) => setState(
                  () {
                    volume = value;
                    _flutterRadioPlayer.setVolume(volume);
                  },
                ),
              ),
              Text(
                "Volume: " + (volume * 100).toStringAsFixed(0),
              ),
              SizedBox(
                height: 15,
              ),
              Text("Metadata Track "),
              StreamBuilder<String>(
                initialData: "",
                stream: _flutterRadioPlayer.metaDataStream,
                builder: (context, snapshot) {
                  return Text(snapshot.data);
                },
              ),
              ElevatedButton(
                style: ElevatedButton.styleFrom(),
                child: Text("Change URL"),
                onPressed: () async {
                  _flutterRadioPlayer.setUrl(
                    "http://209.133.216.3:7018/;stream.mp3",
                    "false",
                  );
                },
              )
            ],
          ),
        ),
        bottomNavigationBar: new BottomNavigationBar(
          currentIndex: this._currentIndex,
          onTap: (int index) {
            setState(() {
              _currentIndex = index;
            });
          },
          items: [
            BottomNavigationBarItem(
              icon: new Icon(Icons.home),
              label: "Home",
            ),
            BottomNavigationBarItem(
              icon: new Icon(Icons.pages),
              label: "Second Page",
            )
          ],
        ),
      ),
    );
  }
}

Download Details:

Author: Sithira

Source Code: https://github.com/Sithira/FlutterRadioPlayer

#flutter #audio #radio 


SoundJS: A Library to Make Working with Audio on The Web Easier

SoundJS

SoundJS is a library to make working with audio on the web easier. It provides a consistent API for playing audio in different browsers, including a target plugin model that makes it easy to add additional audio plugins, such as a Flash fallback (included, but it must be used separately from the combined/minified version).

A mechanism has been provided for easily tying in audio preloading to PreloadJS.

Example

createjs.Sound.on("fileload", handleLoadComplete);
createjs.Sound.alternateExtensions = ["mp3"];
createjs.Sound.registerSound({src:"path/to/sound.ogg", id:"sound"});
function handleLoadComplete(event) {
    createjs.Sound.play("sound");
}

License

Built by gskinner.com, and released for free under the MIT license, which means you can use it for almost any purpose (including commercial projects). We appreciate credit where possible, but it is not a requirement.

Support and Resources

Classes

Sound

The core API for playing sounds. Call createjs.Sound.play(sound, ...options), and a sound instance is created that can be used to control the audio, and dispatches events when it is complete, loops, or is interrupted.

SoundInstance

A controllable sound object that wraps the actual plugin implementation, providing a consistent API for audio playback, no matter what happens in the background. Sound instances can be paused, muted, and stopped; and the volume, pan (where available), and position changed using the simple API.

WebAudioPlugin

The default, built-in plugin, which uses Web Audio APIs to playback sounds. Note that WebAudio will fail to load when run locally, and the HTML audio plugin will be used instead.

HTMLAudioPlugin

The fallback built-in plugin, which manages audio playback via the HTML5 <audio> tag. This will be used in instances where the WebAudio plugin is not available.

CordovaAudioPlugin

An additional plugin which will playback audio in a Cordova app and tools that utilize Cordova such as PhoneGap or Ionic. You must manually register this plugin. Currently available on github since SoundJS-0.6.1.

FlashAudioPlugin

An additional plugin which uses a flash shim (and SWFObject) to playback audio using Flash. You must manually set up and register this plugin.

Documentation and examples

Have a look at the included examples and API documentation for more in-depth information.

Author: CreateJS
Source Code: https://github.com/CreateJS/SoundJS 
License: MIT License

#javascript #audio #html5 


Html5media: Enables <video> and <audio> Tags in All Major Browsers

HTML5 video made easy

All it takes is a single line of code to make HTML5 video and audio tags work in all major browsers.

How to enable video and audio tags in all major browsers

To make HTML5 video and audio tags work in all major browsers, simply add the following line of code somewhere in the <head> of your document.

<script src="http://api.html5media.info/1.1.8/html5media.min.js"></script>

That's it! There is no second step!

How to embed video

You can embed video into your page using the following code.

<video src="video.mp4" width="320" height="200" controls preload></video>

For more information and troubleshooting, please visit the video wiki page.

How to embed audio

You can embed audio into your page using the following code.

<audio src="audio.mp3" controls preload></audio>

For more information and troubleshooting, please visit the audio wiki page.

Why use html5media?

HTML5 video and audio tags were designed to make embedding a video as easy as embedding an image. They were also designed to give users a faster experience by doing away with browser plugins such as Adobe Flash.

Unfortunately, older browsers don't support HTML5 video and audio tags, and even modern browsers don't support a consistent set of video codecs, making embedding a video rather difficult.

The html5media project makes embedding video or audio as easy as it was meant to be. It's a fire-and-forget solution, and doesn't require installing any files on your server. Unlike many other HTML5 video players, it allows people to use the video controls supplied by their own web browser. It's one of the smallest, fastest solutions available, and as browser technology improves it will become even faster.

More information

The html5media project is open source and can be found on GitHub. You can find out more information on the html5media wiki, or the main html5media project page.

About the author

Dave Hall is a freelance web developer, based in Cambridge, UK. You can usually find him on the Internet in a number of different places:

Extra credits

The html5media project bundles together a number of excellent open-source and creative-commons projects. They are listed below.

Author: Etianen
Source Code: https://github.com/etianen/html5media 
License: GPL-3.0 License

#javascript #video #audio 


A Python Package for Time Series Augmentation

tsaug

tsaug is a Python package for time series augmentation. It offers a set of augmentation methods for time series, as well as a simple API to connect multiple augmenters into a pipeline.

See https://tsaug.readthedocs.io for the complete documentation.

Installation

Prerequisites: Python 3.5 or later.

It is recommended to install the most recent stable release of tsaug from PyPI.

pip install tsaug

Alternatively, you could install from source code. This will give you the latest, but unstable, version of tsaug.

git clone https://github.com/arundo/tsaug.git
cd tsaug/
git checkout develop
pip install ./

Examples

A first-time user may start with two examples:

Examples of every individual augmenter can be found here

For full references of implemented augmentation methods, please refer to References.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

Please see Contributing for more details.

Author: Arundo
Source Code: https://github.com/arundo/tsaug 
License: Apache-2.0 License

#python #audio #deep-learning #time #data 


Audio fingerprinting and recognition in Python

dejavu

Audio fingerprinting and recognition algorithm implemented in Python, see the explanation here:
How it works

Dejavu can memorize audio by listening to it once and fingerprinting it. Then by playing a song and recording microphone input or reading from disk, Dejavu attempts to match the audio against the fingerprints held in the database, returning the song being played.

Note: for voice recognition, Dejavu is not the right tool! Dejavu excels at recognition of exact signals with reasonable amounts of noise.

Quickstart with Docker

First, install Docker.

# build and then run our containers
$ docker-compose build
$ docker-compose up -d

# get a shell inside the container
$ docker-compose run python /bin/bash
Starting dejavu_db_1 ... done
root@f9ea95ce5cea:/code# python example_docker_postgres.py 
Fingerprinting channel 1/2 for test/woodward_43s.wav
Fingerprinting channel 1/2 for test/sean_secs.wav
...

# connect to the database and poke around
root@f9ea95ce5cea:/code# psql -h db -U postgres dejavu
Password for user postgres:  # type "password", as specified in the docker-compose.yml !
psql (11.7 (Debian 11.7-0+deb10u1), server 10.7)
Type "help" for help.

dejavu=# \dt
            List of relations
 Schema |     Name     | Type  |  Owner   
--------+--------------+-------+----------
 public | fingerprints | table | postgres
 public | songs        | table | postgres
(2 rows)

dejavu=# select * from fingerprints limit 5;
          hash          | song_id | offset |        date_created        |       date_modified        
------------------------+---------+--------+----------------------------+----------------------------
 \x71ffcb900d06fe642a18 |       1 |    137 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
 \xf731d792977330e6cc9f |       1 |    148 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
 \x71ff24aaeeb55d7b60c4 |       1 |    146 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
 \x29349c79b317d45a45a8 |       1 |    101 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
 \x5a052144e67d2248ccf4 |       1 |    123 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
(5 rows)

# then to shut it all down...
$ docker-compose down

If you want to be able to use the microphone with the Docker container, you'll need to do a little extra work. I haven't had the time to write this up, but if anyone wants to make a PR, I'll happily merge.

Docker alternative on local machine

Follow instructions in INSTALLATION.md

Next, you'll need to create a MySQL database where Dejavu can store fingerprints. For example, on your local setup:

$ mysql -u root -p
Enter password: **********
mysql> CREATE DATABASE IF NOT EXISTS dejavu;

Now you're ready to start fingerprinting your audio collection!

You may also use Postgres, of course. The same method applies.

Fingerprinting

Let's say we want to fingerprint all of July 2013's VA US Top 40 hits.

Start by creating a Dejavu object with your configurations settings (Dejavu takes an ordinary Python dictionary for the settings).

>>> from dejavu import Dejavu
>>> config = {
...     "database": {
...         "host": "127.0.0.1",
...         "user": "root",
...         "password": <password above>, 
...         "database": <name of the database you created above>,
...     }
... }
>>> djv = Dejavu(config)

Next, give the fingerprint_directory method three arguments:

  • input directory to look for audio files
  • audio extensions to look for in the input directory
  • number of processes (optional)
>>> djv.fingerprint_directory("va_us_top_40/mp3", [".mp3"], 3)

For a large amount of files, this will take a while. However, Dejavu is robust enough you can kill and restart without affecting progress: Dejavu remembers which songs it fingerprinted and converted and which it didn't, and so won't repeat itself.

You'll have a lot of fingerprints once it completes a large folder of mp3s:

>>> print djv.db.get_num_fingerprints()
5442376

Also, any subsequent calls to fingerprint_file or fingerprint_directory will fingerprint and add those songs to the database as well. It's meant to simulate a system where, as new songs are released, they are fingerprinted and added to the database seamlessly without stopping the system.

Configuration options

The configuration object to the Dejavu constructor must be a dictionary.

The following keys are mandatory:

  • database, with a value as a dictionary with keys that the database you are using will accept. For example with MySQL, the keys can be anything that the MySQLdb.connect() function will accept.

The following keys are optional:

  • fingerprint_limit: allows you to control how many seconds of each audio file to fingerprint. Leaving out this key, or alternatively using -1 or None, will cause Dejavu to fingerprint the entire audio file. Default value is None.
  • database_type: mysql (the default value) and postgres are supported. If you'd like to add another subclass for BaseDatabase and implement a new type of database, please fork and send a pull request!

An example configuration is as follows:

>>> from dejavu import Dejavu
>>> config = {
...     "database": {
...         "host": "127.0.0.1",
...         "user": "root",
...         "password": "Password123", 
...         "database": "dejavu_db",
...     },
...     "database_type" : "mysql",
...     "fingerprint_limit" : 10
... }
>>> djv = Dejavu(config)

Tuning

Inside config/settings.py, you may want to adjust the following parameters (some values are given below).

FINGERPRINT_REDUCTION = 30
PEAK_SORT = False
DEFAULT_OVERLAP_RATIO = 0.4
DEFAULT_FAN_VALUE = 5
DEFAULT_AMP_MIN = 10
PEAK_NEIGHBORHOOD_SIZE = 10

These parameters are described within the file in detail. Read them in order to understand the impact of changing these values.

Recognizing

There are two ways to recognize audio using Dejavu. You can recognize by reading and processing files on disk, or through your computer's microphone.

Recognizing: On Disk

Through the terminal:

$ python dejavu.py --recognize file sometrack.wav 
{'total_time': 2.863781690597534, 'fingerprint_time': 2.4306554794311523, 'query_time': 0.4067542552947998, 'align_time': 0.007731199264526367, 'results': [{'song_id': 1, 'song_name': 'Taylor Swift - Shake It Off', 'input_total_hashes': 76168, 'fingerprinted_hashes_in_db': 4919, 'hashes_matched_in_input': 794, 'input_confidence': 0.01, 'fingerprinted_confidence': 0.16, 'offset': -924, 'offset_seconds': -30.00018, 'file_sha1': b'3DC269DF7B8DB9B30D2604DA80783155912593E8'}, {...}, ...]}

or in scripting, assuming you've already instantiated a Dejavu object:

>>> from dejavu.logic.recognizer.file_recognizer import FileRecognizer
>>> song = djv.recognize(FileRecognizer, "va_us_top_40/wav/Mirrors - Justin Timberlake.wav")

Recognizing: Through a Microphone

With scripting:

>>> from dejavu.logic.recognizer.microphone_recognizer import MicrophoneRecognizer
>>> song = djv.recognize(MicrophoneRecognizer, seconds=10) # Defaults to 10 seconds.

and with the command line script, you specify the number of seconds to listen:

$ python dejavu.py --recognize mic 10

Testing

Testing out different parameterizations of the fingerprinting algorithm is often useful as the corpus becomes larger and larger, and inevitable tradeoffs between speed and accuracy come into play.

Confidence

Test your Dejavu settings on a corpus of audio files on a number of different metrics:

  • Confidence of match (number fingerprints aligned)
  • Offset matching accuracy
  • Song matching accuracy
  • Time to match

Accuracy

An example script is given in test_dejavu.sh, shown below:

#####################################
### Dejavu example testing script ###
#####################################

###########
# Clear out previous results
rm -rf ./results ./temp_audio

###########
# Fingerprint files of extension mp3 in the ./mp3 folder
python dejavu.py --fingerprint ./mp3/ mp3

##########
# Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5 
# second clips sampled randomly from within each song 8 seconds 
# away from start or end, sampling offset with random seed = 42, and finally, 
# store results in ./results and log to ./results/dejavu-test.log
python run_tests.py \
    --secs 5 \
    --temp ./temp_audio \
    --log-file ./results/dejavu-test.log \
    --padding 8 \
    --seed 42 \
    --results ./results \
    ./mp3

The testing scripts are, as of now, a bit rough and could certainly use some love and attention if you're interested in submitting a PR! For example, underscores in audio filenames currently break the test scripts.

How does it work?

The algorithm works off a fingerprint based system, much like:

The "fingerprints" are locality sensitive hashes that are computed from the spectrogram of the audio. This is done by taking the FFT of the signal over overlapping windows of the song and identifying peaks. A very robust peak finding algorithm is needed, otherwise you'll have a terrible signal to noise ratio.

Here I've taken the spectrogram over the first few seconds of "Blurred Lines". The spectrogram is a 2D plot and shows amplitude as a function of time (a particular window, actually) and frequency, binned logarithmically, just as the human ear perceives it. In the plot below you can see where local maxima occur in the amplitude space:

Spectrogram

Finding these local maxima is a combination of a high pass filter (a threshold in amplitude space) and some image processing techniques to find maxima. A concept of a "neighborhood" is needed - a local maximum with only its directly adjacent pixels is a poor peak - one that will not survive the noise of coming through speakers and through a microphone.

If we zoom in even closer, we can begin to imagine how to bin and discretize these peaks. Finding the peaks itself is the most computationally intensive part, but it's not the end. Peaks are combined using their discrete time and frequency bins to create a unique hash for that particular moment in the song - creating a fingerprint.

Spectrogram zoomed

For a more detailed look at the making of Dejavu, see my blog post here.

How well it works

To truly get the benefit of an audio fingerprinting system, it can't take a long time to fingerprint. It's a bad user experience, and furthermore, a user may only decide to try to match the song with only a few precious seconds of audio left before the radio station goes to a commercial break.

To test Dejavu's speed and accuracy, I fingerprinted a list of 45 songs from the US VA Top 40 from July 2013 (I know, their counting is off somewhere). I tested in three ways:

  1. Reading from disk the raw mp3 -> wav data, and
  2. Playing the song over the speakers with Dejavu listening on the laptop microphone.
  3. Compressed streamed music played on my iPhone

Below are the results.

1. Reading from Disk

Reading from disk was an overwhelming 100% recall - no mistakes were made over the 45 songs I fingerprinted. Since Dejavu gets all of the samples from the song (without noise), it would be a nasty surprise if reading the same file from disk didn't work every time!

2. Audio over laptop microphone

Here I wrote a script to randomly choose n seconds of audio from the original mp3 file to play and have Dejavu listen over the microphone. To be fair, I only allowed segments of audio that were more than 10 seconds from the start/end of the track to avoid listening to silence.

Additionally my friend was even talking and I was humming along a bit during the whole process, just to throw in some noise.

Here are the results for different values of listening time (n):

Matching time

This is pretty rad. For the percentages:

Number of Seconds | Number Correct | Percentage Accuracy
1 | 27 / 45 | 60.0%
2 | 43 / 45 | 95.6%
3 | 44 / 45 | 97.8%
4 | 44 / 45 | 97.8%
5 | 45 / 45 | 100.0%
6 | 45 / 45 | 100.0%

Even with only a single second, randomly chosen from anywhere in the song, Dejavu gets 60%! Adding one more second, to two, takes us to around 96%, while a perfect score only took 5 seconds or more. Honestly, when I was testing this myself, I found Dejavu beat me - identifying a song from only 1-2 seconds heard out of context is pretty hard. I had even been listening to these same songs for two days straight while debugging...

In conclusion, Dejavu works amazingly well, even with next to nothing to work with.

3. Compressed streamed music played on my iPhone

Just to try it out, I tried playing music from my Spotify account (160 kbit/s compressed) through my iPhone's speakers with Dejavu again listening on my MacBook mic. I saw no degradation in performance; 1-2 seconds was enough to recognize any of the songs.

Performance

Speed

On my MacBook Pro, matching was done at 3x listening speed with a small constant overhead. To test, I tried different recording times and plotted the recording time plus the time to match. Since the speed is mostly invariant of the particular song and more dependent on the length of the spectrogram created, I tested on a single song, "Get Lucky" by Daft Punk:

Matching time

As you can see, the relationship is quite linear. The line you see is a least-squares linear regression fit to the data, with the corresponding line equation:

1.364757 * record_time - 0.034373 = time_to_match

Notice, of course, that since the matching itself is single threaded, the matching time includes the recording time. This makes sense with the 3x speed of pure matching, as:

1 (recording) + 1/3 (matching) = 4/3 ~= 1.364757

if we disregard the minuscule constant term.
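For example, the fit predicts that a 10-second recording needs roughly 1.364757 * 10 - 0.034373 ≈ 13.6 seconds from the start of recording to a returned match.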

The overhead of peak finding is the bottleneck - I experimented with multithreading and realtime matching, and alas, it wasn't meant to be in Python. An equivalent Java or C/C++ implementation would most likely have little trouble keeping up, applying the FFT and peak finding in realtime.

An important caveat is, of course, the round trip time (RTT) for making matches. Since my MySQL instance was local, I didn't have to deal with the latency penalty of transferring fingerprint matches over the air. This would add RTT to the constant term in the overall calculation, but would not affect the matching process.

Storage

For the 45 songs I fingerprinted, the database used 377 MB of space for 5.4 million fingerprints. In comparison, the disk usage is given below:

Audio Information Type | Storage in MB
mp3 | 339
wav | 1885
fingerprints | 377

There's a pretty direct trade-off between the necessary record time and the amount of storage needed. Lowering the amplitude threshold for peaks and raising the fan value for fingerprinting will add more fingerprints and bolster the accuracy, at the expense of more space.
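For context, the numbers above work out to roughly 70 bytes per fingerprint (377 MB / 5.4 million), or about 1.1x the size of the source mp3s.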

Author: Worldveil
Source Code: https://github.com/worldveil/dejavu 
License: MIT License

#python #audio 


A library that provides common speech features for ASR, including MFCCs

python_speech_features

This library provides common speech features for ASR including MFCCs and filterbank energies. If you are not sure what MFCCs are and would like to know more, have a look at this MFCC tutorial.

Project Documentation

To cite, please use: James Lyons et al. (2020, January 14). jameslyons/python_speech_features: release v0.6.1 (Version 0.6.1). Zenodo. http://doi.org/10.5281/zenodo.3607820

Installation

This project is on PyPI.

To install from PyPI:

pip install python_speech_features

From this repository:

git clone https://github.com/jameslyons/python_speech_features
python setup.py develop

Usage

Supported features:

  • Mel Frequency Cepstral Coefficients
  • Filterbank Energies
  • Log Filterbank Energies
  • Spectral Subband Centroids

Example use
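A minimal example might look like this (assuming the english.wav sample from the Reference section below sits in the working directory):

from python_speech_features import mfcc, logfbank
import scipy.io.wavfile as wav

(rate, sig) = wav.read("english.wav")   # sample rate and signal
mfcc_feat = mfcc(sig, rate)             # (NUMFRAMES, 13) MFCC features
fbank_feat = logfbank(sig, rate)        # (NUMFRAMES, 26) log filterbank energies

print(fbank_feat[1:3, :])               # inspect a couple of frames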

From here you can write the features to a file etc.

MFCC Features

The default parameters should work fairly well for most cases. If you want to change the MFCC parameters, the following parameters are supported:

def mfcc(signal, samplerate=16000, winlen=0.025, winstep=0.01, numcep=13,
         nfilt=26, nfft=512, lowfreq=0, highfreq=None, preemph=0.97,
         ceplifter=22, appendEnergy=True)
Parameter | Description
signal | the audio signal from which to compute features. Should be an N*1 array
samplerate | the samplerate of the signal we are working with.
winlen | the length of the analysis window in seconds. Default is 0.025s (25 milliseconds)
winstep | the step between successive windows in seconds. Default is 0.01s (10 milliseconds)
numcep | the number of cepstrum to return, default 13
nfilt | the number of filters in the filterbank, default 26.
nfft | the FFT size. Default is 512
lowfreq | lowest band edge of mel filters. In Hz, default is 0
highfreq | highest band edge of mel filters. In Hz, default is samplerate/2
preemph | apply preemphasis filter with preemph as coefficient. 0 is no filter. Default is 0.97
ceplifter | apply a lifter to final cepstral coefficients. 0 is no lifter. Default is 22
appendEnergy | if this is true, the zeroth cepstral coefficient is replaced with the log of the total frame energy.
returns | A numpy array of size (NUMFRAMES by numcep) containing features. Each row holds 1 feature vector.
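For instance, continuing the snippet above, a wider 32 ms window with a denser filterbank could be requested like so (the specific values are only for illustration):

mfcc_feat = mfcc(sig, samplerate=rate, winlen=0.032, winstep=0.016, numcep=13, nfilt=40)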

Filterbank Features

These filters are raw filterbank energies. For most applications you will want the logarithm of these features. The default parameters should work fairly well for most cases. If you want to change the fbank parameters, the following parameters are supported:

def fbank(signal, samplerate=16000, winlen=0.025, winstep=0.01,
          nfilt=26, nfft=512, lowfreq=0, highfreq=None, preemph=0.97)
Parameter | Description
signal | the audio signal from which to compute features. Should be an N*1 array
samplerate | the samplerate of the signal we are working with
winlen | the length of the analysis window in seconds. Default is 0.025s (25 milliseconds)
winstep | the step between successive windows in seconds. Default is 0.01s (10 milliseconds)
nfilt | the number of filters in the filterbank, default 26.
nfft | the FFT size. Default is 512.
lowfreq | lowest band edge of mel filters. In Hz, default is 0
highfreq | highest band edge of mel filters. In Hz, default is samplerate/2
preemph | apply preemphasis filter with preemph as coefficient. 0 is no filter. Default is 0.97
returns | A numpy array of size (NUMFRAMES by nfilt) containing features. Each row holds 1 feature vector. The second return value is the energy in each frame (total energy, unwindowed)
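Continuing the earlier snippet, the two return values can be unpacked like this; taking the log of the first gives the log filterbank energies that most applications want:

import numpy as np
from python_speech_features import fbank

feat, energy = fbank(sig, rate)   # feat: (NUMFRAMES, 26), energy: total energy per frame
log_feat = np.log(feat)           # log filterbank energies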

Reference

sample english.wav obtained from:

wget http://voyager.jpl.nasa.gov/spacecraft/audio/english.au
sox english.au -e signed-integer english.wav

Author: Jameslyons
Source Code: https://github.com/jameslyons/python_speech_features 
License: MIT License

#python #audio 
