Node.js v14.18.0 LTS has been released!

Node.js® is a JavaScript runtime built on Chrome's V8 JavaScript engine.

Notable Changes

  • [3a60de0135] - assert: change status of legacy asserts (James M Snell) #38113
  • [df37c106a7] - (SEMVER-MINOR) buffer: introduce Blob (James M Snell) #36811
  • [223494c548] - (SEMVER-MINOR) buffer: add base64url encoding option (Filip Skokan) #36952
  • [14fc4ddabc] - (SEMVER-MINOR) child_process: allow options.cwd receive a URL (Khaidi Chu) #38862
  • [b68b13acb3] - (SEMVER-MINOR) child_process: add timeout to spawn and fork (Nitzan Uziely) #37256
  • [da98c9f99b] - (SEMVER-MINOR) child_process: allow promisified exec to be cancel (Carlos Fuentes) #34249
  • [779310ac87] - (SEMVER-MINOR) child_process: add 'overlapped' stdio flag (Thiago Padilha) #29412
  • [40eb3b79f1] - (SEMVER-MINOR) cli: add -C alias for --conditions flag (Guy Bedford) #38755
  • [39eba0a2e1] - (SEMVER-MINOR) cli: add --node-memory-debug option (Anna Henningsen) #35537
  • [d8d9a9628a] - (SEMVER-MINOR) dns: add "tries" option to Resolve options (Luan Devecchi) #39610
  • [15ba19b020] - (SEMVER-MINOR) dns: allow --dns-result-order to change default dns verbatim (Ouyang Yadong) #38099
  • [307c1d817f] - doc: refactor fs docs structure (James M Snell) #37170
  • [9ee3f77e32] - (SEMVER-MINOR) errors: remove experimental from --enable-source-maps (Benjamin Coe) #37362
  • [e73bfed2f4] - esm: deprecate legacy main lookup for modules (Guy Bedford) #36918
  • [989c204a58] - (SEMVER-MINOR) fs: allow empty string for temp directory prefix (Voltrex) #39028
  • [ef72490cde] - (SEMVER-MINOR) fs: allow no-params fsPromises fileHandle read (Nitzan Uziely) #38287
  • [cad9d20f64] - (SEMVER-MINOR) fs: add support for async iterators to fsPromises.writeFile (HiroyukiYagihashi) #37490
  • [2b0e2706c0] - fs: improve fsPromises readFile performance (Nitzan Uziely) #37608
  • [fe12cc07b3] - (SEMVER-MINOR) fs: add fsPromises.watch() (James M Snell) #37179
  • [2459c115a8] - (SEMVER-MINOR) fs: allow position parameter to be a BigInt in read and readSync (Darshan Sen) #36190
  • [6544cfb4b9] - (SEMVER-MINOR) http2: add support for sensitive headers (Anna Henningsen) #34145
  • [a6c6cbb4e6] - (SEMVER-MINOR) http2: allow setting the local window size of a session (Yongsheng Zhang) #35978
  • [1e5aca550c] - inspector: mark as stable (Gireesh Punathil) #37748
  • [93af04afbb] - (SEMVER-MINOR) module: add support for URL to import.meta.resolve (Antoine du Hamel) #38587
  • [f9f9389d83] - (SEMVER-MINOR) module: add support for node:‑prefixed require(…) calls (ExE Boss) #37246
  • [87c71065eb] - (SEMVER-MINOR) net: introduce net.BlockList (James M Snell) #34625
  • [b421d99a48] - (SEMVER-MINOR) node-api: allow retrieval of add-on file name (Gabriel Schulhof) #37195
  • [6a4811df8a] - (SEMVER-MINOR) os: add os.devNull (Luigi Pinca) #38569
  • [4a88ddeeca] - (SEMVER-MINOR) perf_hooks: introduce createHistogram (James M Snell) #37155
  • [1a6bf1c4a3] - (SEMVER-MINOR) process: add api to enable source-maps programmatically (legendecas) #39085
  • [99735a6fe8] - (SEMVER-MINOR) process: add 'worker' event (James M Snell) #38659
  • [3982919317] - (SEMVER-MINOR) process: add direct access to rss without iterating pages (Adrien Maret) #34291
  • [526e6c7bde] - (SEMVER-MINOR) readline: add AbortSignal support to interface (Nitzan Uziely) #37932
  • [e6eee08692] - (SEMVER-MINOR) readline: add support for the AbortController to the question method (Mattias Runge-Broberg) #33676
  • [32de361d70] - (SEMVER-MINOR) readline: add history event and option to set initial history (Mattias Runge-Broberg) #33662
  • [797f7f8a38] - (SEMVER-MINOR) repl: add auto‑completion for node:‑prefixed require(…) calls (ExE Boss) #37246
  • [abfd71b64c] - (SEMVER-MINOR) src: call overload ctor from the original ctor (Darshan Sen) #39768
  • [1efae01b18] - (SEMVER-MINOR) src: add a constructor overload for CallbackScope (Darshan Sen) #39768
  • [f7933804ba] - (SEMVER-MINOR) src: allow to negate boolean CLI flags (Michaël Zasso) #39023
  • [6d06ac2202] - (SEMVER-MINOR) src: add --heapsnapshot-near-heap-limit option (Joyee Cheung) #33010
  • [577d228ca0] - (SEMVER-MINOR) src: add way to get IsolateData and allocator from Environment (Anna Henningsen) #36441
  • [658a266cd4] - (SEMVER-MINOR) src: allow preventing SetPrepareStackTraceCallback (Shelley Vohr) #36447
  • [f421422ea4] - (SEMVER-MINOR) src: add maybe versions of EmitExit and EmitBeforeExit (Anna Henningsen) #35486
  • [a62d4d60f4] - (SEMVER-MINOR) stream: add readableDidRead if has been read from (Robert Nagy) #39589
  • [63502131a3] - (SEMVER-MINOR) stream: pipeline accept Buffer as a valid first argument (Nitzan Uziely) #37739
  • [68bbebd42c] - (SEMVER-MINOR) tls: allow reading data into a static buffer (Andrey Pechkurov) #35753
  • [1cbb74d63d] - (SEMVER-MINOR) url: expose urlToHttpOptions utility (Yongsheng Zhang) #35960
  • [8eb11356dd] - (SEMVER-MINOR) util: expose toUSVString (Robert Nagy) #39814
  • [84fcdc3074] - (SEMVER-MINOR) v8: implement v8.stopCoverage() (Joyee Cheung) #33807
  • [b238b6bf17] - (SEMVER-MINOR) v8: implement v8.takeCoverage() (Joyee Cheung) #33807
  • [9f6bc58da8] - (SEMVER-MINOR) worker: add setEnvironmentData/getEnvironmentData (James M Snell) #37486

Commits

Semver-minor commits

  • [f3563d3197] - (SEMVER-MINOR) async_hooks: use new v8::Context PromiseHook API (Stephen Belanger) #36394
  • [df37c106a7] - (SEMVER-MINOR) buffer: introduce Blob (James M Snell) #36811
  • [223494c548] - (SEMVER-MINOR) buffer: add base64url encoding option (Filip Skokan) #36952
  • [14fc4ddabc] - (SEMVER-MINOR) child_process: allow options.cwd receive a URL (Khaidi Chu) #38862
  • [b68b13acb3] - (SEMVER-MINOR) child_process: add timeout to spawn and fork (Nitzan Uziely) #37256
  • [da98c9f99b] - (SEMVER-MINOR) child_process: allow promisified exec to be cancel (Carlos Fuentes) #34249
  • [779310ac87] - (SEMVER-MINOR) child_process: add 'overlapped' stdio flag (Thiago Padilha) #29412
  • [40eb3b79f1] - (SEMVER-MINOR) cli: add -C alias for --conditions flag (Guy Bedford) #38755
  • [39eba0a2e1] - (SEMVER-MINOR) cli: add --node-memory-debug option (Anna Henningsen) #35537
  • [d9b58a0262] - (SEMVER-MINOR) deps: V8: cherry-pick fa4cb172cde2 (Stephen Belanger) #38577
  • [9d7177c152] - (SEMVER-MINOR) deps: V8: cherry-pick 4c074516397b (Stephen Belanger) #36394
  • [ec0f0ef8ef] - (SEMVER-MINOR) deps: V8: cherry-pick 5f4413194480 (Stephen Belanger) #36394
  • [3e7238e45a] - (SEMVER-MINOR) deps: V8: cherry-pick 272445f10927 (Stephen Belanger) #36394
  • [214e568597] - (SEMVER-MINOR) deps: V8: backport c0fceaa0669b (Stephen Belanger) #36394
  • [d8d9a9628a] - (SEMVER-MINOR) dns: add "tries" option to Resolve options (Luan Devecchi) #39610
  • [15ba19b020] - (SEMVER-MINOR) dns: allow --dns-result-order to change default dns verbatim (Ouyang Yadong) #38099
  • [defb77cac9] - (SEMVER-MINOR) doc: add missing change to resolver ctor (Luan Devecchi) #39610
  • [9ee3f77e32] - (SEMVER-MINOR) errors: remove experimental from --enable-source-maps (Benjamin Coe) #37362
  • [989c204a58] - (SEMVER-MINOR) fs: allow empty string for temp directory prefix (Voltrex) #39028
  • [ef72490cde] - (SEMVER-MINOR) fs: allow no-params fsPromises fileHandle read (Nitzan Uziely) #38287
  • [cad9d20f64] - (SEMVER-MINOR) fs: add support for async iterators to fsPromises.writeFile (HiroyukiYagihashi) #37490
  • [fe12cc07b3] - (SEMVER-MINOR) fs: add fsPromises.watch() (James M Snell) #37179
  • [2459c115a8] - (SEMVER-MINOR) fs: allow position parameter to be a BigInt in read and readSync (Darshan Sen) #36190
  • [6544cfb4b9] - (SEMVER-MINOR) http2: add support for sensitive headers (Anna Henningsen) #34145
  • [a6c6cbb4e6] - (SEMVER-MINOR) http2: allow setting the local window size of a session (Yongsheng Zhang) #35978
  • [93af04afbb] - (SEMVER-MINOR) module: add support for URL to import.meta.resolve (Antoine du Hamel) #38587
  • [f9f9389d83] - (SEMVER-MINOR) module: add support for node:‑prefixed require(…) calls (ExE Boss) #37246
  • [76d4f22bab] - (SEMVER-MINOR) net: allow net.BlockList to use net.SocketAddress objects (James M Snell) #37917
  • [82363d864d] - (SEMVER-MINOR) net: add SocketAddress class (James M Snell) #37917
  • [0202ba46b8] - (SEMVER-MINOR) net: make net.BlockList cloneable (James M Snell) #37917
  • [a41a3e3b3f] - (SEMVER-MINOR) net: make blocklist family case insensitive (James M Snell) #34864
  • [87c71065eb] - (SEMVER-MINOR) net: introduce net.BlockList (James M Snell) #34625
  • [b421d99a48] - (SEMVER-MINOR) node-api: allow retrieval of add-on file name (Gabriel Schulhof) #37195
  • [6a4811df8a] - (SEMVER-MINOR) os: add os.devNull (Luigi Pinca) #38569
  • [4a88ddeeca] - (SEMVER-MINOR) perf_hooks: introduce createHistogram (James M Snell) #37155
  • [1a6bf1c4a3] - (SEMVER-MINOR) process: add api to enable source-maps programmatically (legendecas) #39085
  • [99735a6fe8] - (SEMVER-MINOR) process: add 'worker' event (James M Snell) #38659
  • [3982919317] - (SEMVER-MINOR) process: add direct access to rss without iterating pages (Adrien Maret) #34291
  • [526e6c7bde] - (SEMVER-MINOR) readline: add AbortSignal support to interface (Nitzan Uziely) #37932
  • [e6eee08692] - (SEMVER-MINOR) readline: add support for the AbortController to the question method (Mattias Runge-Broberg) #33676
  • [32de361d70] - (SEMVER-MINOR) readline: add history event and option to set initial history (Mattias Runge-Broberg) #33662
  • [797f7f8a38] - (SEMVER-MINOR) repl: add auto‑completion for node:‑prefixed require(…) calls (ExE Boss) #37246
  • [abfd71b64c] - (SEMVER-MINOR) src: call overload ctor from the original ctor (Darshan Sen) #39768
  • [1efae01b18] - (SEMVER-MINOR) src: add a constructor overload for CallbackScope (Darshan Sen) #39768
  • [1aa2080d29] - (SEMVER-MINOR) src: fix align in cares_wrap.h (Luan) #39610
  • [f7933804ba] - (SEMVER-MINOR) src: allow to negate boolean CLI flags (Michaël Zasso) #39023
  • [6d06ac2202] - (SEMVER-MINOR) src: add --heapsnapshot-near-heap-limit option (Joyee Cheung) #33010
  • [4091eb9db7] - (SEMVER-MINOR) src: move node_binding to modern THROW_ERR* (James M Snell) #35469
  • [577d228ca0] - (SEMVER-MINOR) src: add way to get IsolateData and allocator from Environment (Anna Henningsen) #36441
  • [658a266cd4] - (SEMVER-MINOR) src: allow preventing SetPrepareStackTraceCallback (Shelley Vohr) #36447
  • [f421422ea4] - (SEMVER-MINOR) src: add maybe versions of EmitExit and EmitBeforeExit (Anna Henningsen) #35486
  • [a62d4d60f4] - (SEMVER-MINOR) stream: add readableDidRead if has been read from (Robert Nagy) #39589
  • [63502131a3] - (SEMVER-MINOR) stream: pipeline accept Buffer as a valid first argument (Nitzan Uziely) #37739
  • [72ef41c72b] - (SEMVER-MINOR) test: add wpt tests for Blob (Michaël Zasso) #36811
  • [68bbebd42c] - (SEMVER-MINOR) tls: allow reading data into a static buffer (Andrey Pechkurov) #35753
  • [587deacad9] - (SEMVER-MINOR) tools: add Worker to type-parser (James M Snell) #38659
  • [1cbb74d63d] - (SEMVER-MINOR) url: expose urlToHttpOptions utility (Yongsheng Zhang) #35960
  • [8eb11356dd] - (SEMVER-MINOR) util: expose toUSVString (Robert Nagy) #39814
  • [84fcdc3074] - (SEMVER-MINOR) v8: implement v8.stopCoverage() (Joyee Cheung) #33807
  • [b238b6bf17] - (SEMVER-MINOR) v8: implement v8.takeCoverage() (Joyee Cheung) #33807
  • [9f6bc58da8] - (SEMVER-MINOR) worker: add setEnvironmentData/getEnvironmentData (James M Snell) #37486

Semver-patch commits

  • [3a60de0135] - assert: change status of legacy asserts (James M Snell) #38113
  • [5a42be9719] - async_hooks: use resource stack for AsyncLocalStorage run (Stephen Belanger) #39890
  • [fc29ddb38e] - async_hooks: emit promise trace events from JS (Stephen Belanger) #39135
  • [13296d1abf] - async_hooks: eliminate native PromiseHook (Stephen Belanger) #39135
  • [48e5971e51] - async_hooks: check for empty contexts before removing (Bryan English) #39095
  • [691c00c48b] - async_hooks: switch between native and context hooks correctly (Stephen Belanger) #38912
  • [8484ab2a6c] - buffer: avoid creating the backing store in the thread (James M Snell) #37052
  • [c8d039a872] - buffer: make Blob's constructor more spec-compliant (Michaël Zasso) #37361
  • [05d73ac286] - buffer: make Blob's slice method more spec-compliant (Michaël Zasso) #37361
  • [e7cf2efc60] - buffer: add @@toStringTag to Blob (Colin Ihrig) #37336
  • [d99deeaf97] - build: fix update authors commit (Mestery) #39858
  • [5e1cba81bf] - build: add authors.yml (Tierney Cyren) #35831
  • [ed3c332089] - build: add option to hide console window (Cheng Zhao) #39712
  • [c696f97c5e] - build: exclude markdown files from some GitHub Actions (Rich Trott) #39565
  • [0bd6dd1ee2] - build: use lts shorthand in GitHub Actions (Rich Trott) #39538
  • [3482bca643] - build: override python executable path on configure (legendecas) #39465
  • [61261cdb8e] - build: use Node.js 14 in commit-lint.yml (Rich Trott) #39506
  • [719f1563c1] - build: fix host_arch_cc() for AIX/IBM i (Richard Lau) #39481
  • [6e06b2ff9d] - build: update coverage Makefile target comments (Richard Lau) #39365
  • [4e28d2b2c0] - build: run workflows when a PR is ready for review (Michaël Zasso) #39405
  • [0da5d74da4] - build: update to setup-node@v2 (Rich Trott) #39366
  • [f2e1c2267e] - build: update gcovr for gcc 8 compatibility (Richard Lau) #39326
  • [131dd6ec4d] - build: remove unused comment in Makefile (LitoMore) #39171
  • [40e46321b0] - build: uvwasi honours node_shared_libuv (Jérémy Lal) #39260
  • [5c6ab719f2] - build: shorten path used in tarball build workflow (Richard Lau) #39192
  • [870526374c] - build: add library_files to gyp variables (himself65) #39293
  • [0e221156aa] - build: pass directory instead of list of files to js2c.py (Joyee Cheung) #39069
  • [8d8415415b] - build: don't pass --mode argument to V8 test-runner (Richard Lau) #39055
  • [2d50217634] - build: fix commit linter on unrebased PRs (Mary Marchini) #39121
  • [c93d5e006e] - build: use Actions to validate commit message (Mary Marchini) #32417
  • [0bcaf9c4d1] - child_process: fix spawn and fork abort behavior (Nitzan Uziely) #37325
  • [8010c83180] - child_process: fix bad abort signal leak (Nitzan Uziely) #37257
  • [32aff2f5a0] - console: refactor to avoid unsafe array iteration (Antoine du Hamel) #36753
  • [f46e8cdf79] - debugger: remove undefined parameter (Rich Trott) #39570
  • [482459edd4] - debugger: validate sec-websocket-accept response header (Chris Opperwall) #39357
  • [e9c46107d7] - debugger: rename internal module (Rich Trott) #39378
  • [49e0883c75] - debugger: indicate server is ending (Rich Trott) #39334
  • [72a3419510] - debugger: rename inspector-cli test module to debugger (Rich Trott) #38530
  • [b3352cfba4] - debugger: prevent simultaneous heap snapshots (Rich Trott) #39638
  • [e5826ab1c2] - debugger: remove final lint exceptions in inspect_repl.js (Rich Trott) #39078
  • [34c0701952] - deps: V8: cherry-pick 00bb1a77c03e (Darshan Sen) #39829
  • [42359ab582] - deps: upgrade to libuv 1.42.0 (Luigi Pinca) #39525
  • [d863a9db68] - deps: bump HdrHistogram_C to 0.11.2 (Matteo Collina) #39462
  • [4c93968a62] - deps: extract gtest source files to deps/googletest (legendecas) #39386
  • [fcae391fed] - deps: update Acorn to v8.4.1 (Michaël Zasso) #39166
  • [327838dd96] - deps: V8: backport c9224589cf53 (Stephen Belanger) #39743
  • [89c1bbd7b2] - deps: V8: cherry-pick 81814ed44574 (Stephen Belanger) #39719
  • [8b9215d07c] - deps: update to cjs-module-lexer@1.2.2 (Guy Bedford) #39402
  • [e201293ddb] - dgram: use simplified validator (Voltrex) #39753
  • [6fdac38f91] - doc,fs: remove experimental status for WHATWG URL as path (Antoine du Hamel) #38870
  • [d56e8268f9] - doc,lib: prepare for stricter multi-line array linting (Rich Trott) #37088
  • [5500ae9236] - domain: do not add domain to promise from other context (Stephen Belanger) #39135
  • [dc855af18e] - errors: don't throw TypeError on missing export (Benjamin Coe) #39017
  • [c13eadc218] - errors: eliminate all overhead for hidden calls (Momtchil Momtchev) #35644
  • [d42bbe48c5] - esm: use correct URL for error decoration (Bradley Farias) #37854
  • [9db3304368] - esm: update to correct deprecation code (Colin Ihrig) #37147
  • [e73bfed2f4] - esm: deprecate legacy main lookup for modules (Guy Bedford) #36918
  • [c1782ea1f5] - events: allow the options argument to be null (Luigi Pinca) #39486
  • [d2834fb97f] - fs: improve fsPromises writeFile performance (Nitzan Uziely) #37610
  • [ee1d13c90d] - fs: use byteLength to handle ArrayBuffer views (Michaël Zasso) #38187
  • [b38d6b475b] - fs: fixup negative length in fs.truncate (James M Snell) #37483
  • [fe28128f3c] - fs: add docs and tests for AsyncIterable support in fh.writeFile (Antoine du Hamel) #39836
  • [2b0e2706c0] - fs: improve fsPromises readFile performance (Nitzan Uziely) #37608
  • [a4d6f78619] - fs: move constants to internal/fs/utils.js (Darshan Sen) #38061
  • [402f7722ce] - fs: add validatePosition and use in read and readSync (Darshan Sen) #37051
  • [2bc301dcff] - http: decodes url.username and url.password for authorization header (Lew Gordon) #39310
  • [5459f4af33] - http: clean up HttpParser correctly (Tobias Koppers) #39292
  • [8b3feee148] - http,https: align server option of https with http (Qingyu Deng) #38992
  • [cf59e87c8b] - inspector: update inspector_protocol to 89c4adf (Rich Trott) #39650
  • [ea5f2047a2] - inspector: update inspector_protocol to 8ec18cf (Rich Trott) #39614
  • [1e5aca550c] - inspector: mark as stable (Gireesh Punathil) #37748
  • [8a2ce5dae6] - inspector: move inspector async hooks to environment (Joyee Cheung) #39112
  • [338189ff6f] - lib: simplify validators (Voltrex) #39753
  • [e1019351e8] - lib: cleanup validation (Voltrex) #39652
  • [dbaf4988bc] - lib: use validators (Voltrex) #39663
  • [9c33e4bfb2] - lib: use validator (Voltrex) #39547
  • [5b1104291d] - lib: use validateObject (Voltrex) #39605
  • [1ce81079df] - lib: remove use of array destructuring (Antoine du Hamel) #36818
  • [b24b34effd] - lib: add bound apply variants of varargs primordials (ExE Boss) #37005
  • [7cdff9a6a8] - lib: refactor primordials.makeSafe to use more primordials (ExE Boss) #36865
  • [1737352580] - lib: comment explaining special-case handling of promises (Stephen Belanger) #39135
  • [7f54cccb6c] - lib: refactor to use validateString (ZiJian Liu) #37006
  • [98259dc527] - module: improve support of data: URLs (Antoine du Hamel) #37392
  • [9aba2888a1] - net: throw ERR_OUT_OF_RANGE if blockList.addSubnet prefix is NaN (ZiJian Liu) #36732
  • [2ca12c83b4] - node-api: handle pending exception in cb wrapper (Michael Dawson) #39476
  • [9e5edf2158] - node-api: cctest on v8impl::Reference (legendecas) #38970
  • [a74032a490] - node-api: rtn pending excep on napi_new_instance (legendecas) #38798
  • [bcb85adee6] - policy: canonicalize before resolving specifiers (Bradley Farias) #37863
  • [0ff520cf02] - policy: fix integrity when DEFAULT_ENCODING is set (Tobias Nießen) #39750
  • [6c87b591d9] - readline: allow completer to rewrite existing input (Anna Henningsen) #39178
  • [37b4708b19] - repl: fix tla function hoisting (Don Jayamanne) #39745
  • [9264caeafe] - repl: do not include legacy getter/setter methods in completion (Anna Henningsen) #39576
  • [50c5e71e22] - repl: correctly hoist top level await declarations (ejose19) #39265
  • [1e065a0a43] - repl: processTopLevelAwait fallback error handling (ejose19) #39290
  • [99664494ff] - repl: ensure correct syntax err for await parsing (Guy Bedford) #39154
  • [761dafafde] - repl: fix Ctrl+C on top level await (Antoine du Hamel) #38656
  • [88b02cbb08] - repl: add auto‑completion for dynamic import calls (ExE Boss) #37178
  • [8f3a8830ba] - repl: refactor to avoid unsafe array iteration (Antoine du Hamel) #37188
  • [a48e2d6ec7] - repl: refactor to avoid unsafe array iteration (Darshan Sen) #36663
  • [20ffadf437] - repl: refactor to use more primordials (Antoine du Hamel) #36264
  • [f69c934ad4] - report: generates report on threads with no isolates (legendecas) #38994
  • [c4686fa5a7] - src: fix TextDecoder final flush size calculation (James M Snell) #39737
  • [495cd02c20] - src: add cosmetic space character to async_wrap.h file (Juan José Arboleda) #39459
  • [985ec48975] - src: print native module id on native module not found (legendecas) #39460
  • [e6ff7e648e] - src: close HandleWraps instead of deleting them in OnGCCollect() (Anna Henningsen) #39441
  • [5c473bdc12] - src: remove unused guards around node-api reference (legendecas) #38334
  • [41213bd507] - src: add JSDoc typings for v8 (Voltrex) #38944
  • [02b1df9fac] - src: fix crash in AfterGetAddrInfo (Anna Henningsen) #39735
  • [99493b07d4] - src: fix fatal errors when a current isolate not exist (legendecas) #38624
  • [9433c28c14] - src: remove more extra semis from member fns (Shelley Vohr) #38744
  • [bad990c934] - src: use BaseObject::kInteralFieldCount in Blob (Joyee Cheung) #36991
  • [0a759dff52] - src: compare IPv4 addresses in host byte order (Colin Ihrig) #39096
  • [d73181f243] - src: reduce duplicated boilerplate with new env utility fn (James M Snell) #36536
  • [85af15a8b6] - src: allow instances of net.BlockList to be created internally (James M Snell) #34741
  • [1008c80176] - src: add SocketAddressLRU Utility (James M Snell) #34618
  • [e404841a9c] - src: set PromiseHooks by Environment (Bryan English) #38821
  • [c8c290ae8f] - src,zlib: tighten up Z_*_WINDOWBITS macros (Khaidi Chu) #39115
  • [de171177b4] - stream: clean endWritableNT (Mestery) #39645
  • [32a5b8f59b] - stream: move duplicated code to an internal module (Rich Trott) #37508
  • [f90b22d351] - util: add internal createDeferredPromise() (Colin Ihrig) #37095
  • [61b4a98480] - zlib: avoid converting Uint8Array instances to Buffer (Antoine du Hamel) #39492

Documentation commits

Other commits

  • [ab66dabbf2] - doc,meta: update email addresses for misterdjules (Rich Trott) #39433
  • [c6ccd97fe2] - doc,tools: remove checkLinks.mjs (Antoine du Hamel) #39206
  • [8f8f528f08] - meta: add gyp as owner of gyp files and tools/gyp (Mary Marchini) #34847
  • [4b2eee5232] - meta: consolidate AUTHORS entries for ooHmartY (Rich Trott) #39705
  • [6916a6c2b0] - meta: consolidate AUTHORS entries for homosaur (Rich Trott) #39705
  • [b65a635c8a] - meta: consolidate AUTHORS entries for Ayase-252 (Rich Trott) #39705
  • [e86b59cf4c] - meta: consolidate AUTHORS entries for robin-drexler (Rich Trott) #39705
  • [1eda8442bd] - meta: consolidate AUTHORS entries for samshull (Rich Trott) #39705
  • [cd67d86572] - meta: update AUTHORS (Rich Trott) #39705
  • [bb06282a9e] - meta: consolidate email addresses for MarshallOfSound (Rich Trott) #39651
  • [12fe34eae4] - meta: consolidate email addresses for tadjik1 (Rich Trott) #39651
  • [4301e252b4] - meta: consolidate email addresses for szmarczak (Rich Trott) #39651
  • [3e8fc49730] - meta: update AUTHORS (Rich Trott) #39636
  • [60f41c34dd] - meta: simplify mailmap (Rich Trott) #39612
  • [fc9c680260] - meta: consolidate emails for tadhgcreedon (Rich Trott) #39611
  • [d87fcf9959] - meta: consolidate emails for timcosta (Rich Trott) #39611
  • [fdbe97849b] - meta: consolidate emails for timruffles (Rich Trott) #39611
  • [b9f2ea92e9] - meta: update AUTHORS (Rich Trott) #39629
  • [472cf1520e] - meta: add mailmap entry for ryzokuken (Rich Trott) #39596
  • [ae3f8b1eda] - meta: add mailmap entry for uttampawar (Rich Trott) #39596
  • [2a2d8ebd90] - meta: add mailmap entry for dmabupt (Rich Trott) #39596
  • [030036ec92] - meta: align README/.mailmap/AUTHORS email entries (Rich Trott) #39505
  • [fd2146be91] - meta: add mailmap entry for garygsc (Rich Trott) #39588
  • [0833e2d9cb] - meta: add mailmap entry for ttzztztz (Rich Trott) #39588
  • [1fbc19ee32] - meta: update AUTHORS (Rich Trott) #39587
  • [2d6428665d] - meta: update .mailmap to remove duplication in AUTHORS (Rich Trott) #39561
  • [6c4febd701] - meta: add .mailmap entries to remove AUTHORS duplicates (Rich Trott) #39560
  • [1755f49a20] - meta: add .mailmap entry to remove duplication in AUTHORS (Rich Trott) #39559
  • [fdcc5729d9] - meta: update collaborator email in AUTHORS/.mailmap (Rich Trott) #39521
  • [27e9a44852] - meta: update collaborator email in README (Rich Trott) #39521
  • [5e1c49ff0f] - meta: update collaborator email in AUTHORS/.mailmap (Rich Trott) #39521
  • [fbecae169e] - meta: move gdams to emeritus (Rich Trott) #39539
  • [48ec33f1b8] - meta: update collaborator email in README (Rich Trott) #39510
  • [f269df31ea] - meta: remove unneeded .mailmap entry (Rich Trott) #39512
  • [b0c1aab28d] - meta: update email address for collaborator (Rich Trott) #39511
  • [5f4935292a] - meta: align collaborator name in .mailmap/AUTHORS with README (Rich Trott) #39489
  • [1b2078c912] - meta: align email address in README/.mailmap/AUTHORS (Rich Trott) #39503
  • [2f816bf24b] - meta: revise .mailmap for README consistency (Rich Trott) #39457
  • [1302a911f5] - meta: alphabetize .mailmap file (Rich Trott) #39434
  • [55322c0260] - meta: align collaborator email in .mailmap/AUTHORS with README (Rich Trott) #39478
  • [83f5cc0bd4] - meta: update AUTHORS (Rich Trott) #39461
  • [69b56a3fe9] - meta: add .mailmap entry for new email for existing contributor (Rich Trott) #39431
  • [2f325c946f] - meta: use form schema for bug report template (Michaël Zasso) #39194
  • [9766a99dd2] - meta: add @nodejs/actions as CODEOWNERS (Mary Marchini) #39119
  • [007f9a0e36] - test: fix test-vm-memleak for high baseline platforms (Rich Trott) #38062
  • [0fabd8e755] - test: fix flaky test-vm-memleak (Rich Trott) #38054
  • [64fb928ec7] - test: fix flaky test-child-process-exec-abortcontroller-promisified (Antoine du Hamel) #37572
  • [e660892f1a] - test: use simplfied validator (voltrexmaster) #39753
  • [779417f97e] - test: use template to concatenate string (Himadri Ganguly) #39621
  • [a61076042d] - test: deflake test-http2-buffersize (Luigi Pinca) #39591
  • [68ef265c39] - test: convert anonymous function to arrow function (Himadri Ganguly) #39604
  • [78db43c9e7] - test: add test-debugger-breakpoint-exists (Rich Trott) #39570
  • [5696bcf715] - test: fix WASI link test (Richard Lau) #39485
  • [0b564a6d40] - test: add test for WebSocket secret verification in debugger (Rich Trott) #39357
  • [831f266d6f] - test: put common lint exceptions into config file (Rich Trott) #39358
  • [d8066f5325] - test: mark test-domain-error-types flaky (James M Snell) #39369
  • [c915a1bd04] - test: remove eslint-disable comment from fixture file (Rich Trott) #39320
  • [1eb8307cc5] - test: move debugger test case to parallel (Rich Trott) #39300
  • [546202364c] - test: remove debugger workaround for AIX (Rich Trott) #39296
  • [e12164e88d] - test: fix test-debugger-heap-profiler for workers (Richard Lau) #39687
  • [a45bf2f1a0] - test: use common.PORT instead of hardcoded port number (Rich Trott) #39298
  • [9b737ebd4b] - test: add test for debugger restart message issue (Rich Trott) #39273
  • [68523894ab] - test: remove workaround code in debugger test (Rich Trott) #39238
  • [2cd414147b] - test: move test-debugger-address to parallel (Rich Trott) #39236
  • [a2e4020e4b] - test: prepare for consistent comma-dangle lint rule (Rich Trott) #37930
  • [62b439e04d] - test: replace "inspector-cli" with "debugger" (Rich Trott) #39156
  • [f13a302d23] - test: improve coverage of stream.Readable (Rongjian Zhang) #38702
  • [f3d2e6ac29] - test: add tests for bound apply variants of varargs primordials (ExE Boss) #37005
  • [f70fd00fb3] - test: use localhost test instead of connecting to remote (Adam Majer) #39011
  • [c4ff5e4a7e] - test: update error message keywords (leeight) #39826
  • [922dacebfb] - test: increase coverage for Blob (ZiJian Liu) #38515
  • [c6ab19895d] - test: account for OOM risks in heapsnapshot-near-heap-limit tests (Joyee Cheung) #37761
  • [971d5be57c] - test: split heap snapshot limit tests (Rich Trott) #37189
  • [815d59a7b3] - test: fix test-memory-usage.js for IBMi (Rich Trott) #36758
  • [aa5309c33f] - test: increase coverage for net/blocklist (Zijian Liu) #36405
  • [f3be3ec417] - test: check mustCall errors in test-fs-read-type (Tobias Nießen) #36914
  • [b643fe7edf] - test: use faster variant for rss (Pooja D P) #36839
  • [d4362db111] - test: use faster variant for rss in test-crypto-dh-leak (Pooja D P) #36766
  • [3094ef967a] - test: use faster variant for rss in test-vm-memleak.js (Pooja D P) #36769
  • [ff7879b41e] - test: use faster variant for rss test-memoryusage-emfile (Pooja D P) #36768
  • [d39200c7f4] - tools: make utils.SearchFiles Python2-compatible (Michaël Zasso) #40020
  • [55493f2011] - tools: update workflow to open a pull request (Rich Trott) #39825
  • [417a3ac474] - tools: use find-inactive-collaborators to modify README.md (Rich Trott) #39825
  • [e9b1a006a1] - tools: fix markdown linting (Rich Trott) #39832
  • [67f1bff657] - tools: update markdown linter dependencies and move to ESM (Antoine du Hamel) #39801
  • [67c5921e8a] - tools: update rollup to latest version in markdown linter (Rich Trott) #39797
  • [64714b429a] - tools: update markdown lint dependencies (Rich Trott) #39770
  • [de9461168a] - tools: bump remark-preset-lint-node to 3.0.0 (Rich Trott) #39755
  • [dfdf6c7317] - tools: update markdown linter rules (Rich Trott) #38384
  • [f8fee449f7] - tools: update path-parse in markdown linter package-lock file (Rich Trott) #39729
  • [a338c0e07b] - tools: fix more build warnings in inspector_protocol (Richard Lau) #39725
  • [09630cf199] - tools: cherry-pick ffb34b6d5dbf0 (Darshan Sen) #39725
  • [26a067e33e] - tools: update inspector_protocol to e8ba1a7 (Rich Trott) #39694
  • [9847d58feb] - tools: update inspector_protocol to 39ca567 (Rich Trott) #39694
  • [6870bb7505] - tools: update inspector_protocol to 97d3146 (Rich Trott) #39694
  • [383fa01e97] - Revert "tools: fix compiler warning in inspector_protocol" (Rich Trott) #39694
  • [b95a759c86] - tools: update inspector_protocol to a53e96d31a2755eb16ca37 (Rich Trott) #39694
  • [ad39687422] - tools: update inspector_protocol to fe0467fd105a (Rich Trott) #39694
  • [78de83cc74] - tools: improve error detection in find-inactive-collaborators (Rich Trott) #39617
  • [a5152a0875] - tools: flag README/mailmap mismatches in find-inactive-collaborators (Rich Trott) #39477
  • [87c5332f89] - tools: use mailmap for find-inactive-collaborators (Rich Trott) #39432
  • [f75224f1ce] - tools: email matchin is case insensitive for .mailmap (Rich Trott) #39430
  • [dfb77a581f] - tools: make internal link checker more robust (Rich Trott) #39429
  • [d2c0da20a0] - tools: added remark-frontmatter (Ben Halverson) #38717
  • [cec04821aa] - tools: change commit fetch limiting in find-inactive-collaborators (Rich Trott) #39362
  • [d948148498] - tools: use Node.js 16.x for GitHub workflow (Rich Trott) #39362
  • [edc5791b5a] - tools: add GitHub Action to run find-inactive-collaborators.mjs (Rich Trott) #39335
  • [d86d37bc9e] - tools: relax max-len lint rule for template strings (Rich Trott) #38097
  • [f467e2a0c5] - tools: pass bot token to node-pr-labeler (Michaël Zasso) #39271
  • [61ec594609] - tools: add find-inactive-collaborators.js (Rich Trott) #39262
  • [ff0ca11521] - tools: update path-parse to 1.0.7 (Rich Trott) #39232
  • [b8fb75121b] - tools: remove unused lint-pr-commit-message.sh (Richard Lau) #39120
  • [e7761b627f] - tools: apply consistent comma-dangle lint rule (Rich Trott) #37930
  • [315eba7789] - tools: make comma-dangle ESLint rule more stringent … (Rich Trott) #37088
  • [3ecfe9d7ee] - tools: update remark-preset-lint-node to 2.4.1 (Rich Trott) #39201
  • [70e527c0c7] - tools: upgrade highlight.js to version 11.0.1 (Antoine du Hamel) #39032
  • [7b2bebba7a] - tools: add support for import assertions in linter (Antoine du Hamel) #39924
  • [1353a6e22f] - tools: update ESLint to 7.32.0 (Luigi Pinca) #39602
  • [509f26549c] - tools: update ESLint to 7.31.0 (Colin Ihrig) #39424
  • [f0e0c8f720] - tools: update ESLint to 7.30.0 (Colin Ihrig) #39242
  • [6540c271e4] - tools: update @babel/eslint-parser to 7.14.7 (Rich Trott) #39160
  • [d7e2318e74] - tools: add ESLint rule no-array-destructuring (Antoine du Hamel) #36818
  • [87e5429334] - tools,doc: fix error message for unrecognized type (Antoine du Hamel) #39221
  • [f206af679c] - typings: add a few JSDoc typings for the net lib module (nerdthatnoonelikes) #38953
  • [d458cd7e2b] - typings: add JSDoc typings for timers (Voltrex) #38834

Windows 32-bit Installer: https://nodejs.org/dist/v14.18.0/node-v14.18.0-x86.msi
Windows 64-bit Installer: https://nodejs.org/dist/v14.18.0/node-v14.18.0-x64.msi
Windows 32-bit Binary: https://nodejs.org/dist/v14.18.0/win-x86/node.exe
Windows 64-bit Binary: https://nodejs.org/dist/v14.18.0/win-x64/node.exe
macOS 64-bit Installer: https://nodejs.org/dist/v14.18.0/node-v14.18.0.pkg
macOS Intel 64-bit Binary: https://nodejs.org/dist/v14.18.0/node-v14.18.0-darwin-x64.tar.gz
Linux 64-bit Binary: https://nodejs.org/dist/v14.18.0/node-v14.18.0-linux-x64.tar.xz
Linux PPC LE 64-bit Binary: https://nodejs.org/dist/v14.18.0/node-v14.18.0-linux-ppc64le.tar.xz
Linux s390x 64-bit Binary: https://nodejs.org/dist/v14.18.0/node-v14.18.0-linux-s390x.tar.xz
AIX 64-bit Binary: https://nodejs.org/dist/v14.18.0/node-v14.18.0-aix-ppc64.tar.gz
ARMv7 32-bit Binary: https://nodejs.org/dist/v14.18.0/node-v14.18.0-linux-armv7l.tar.xz
ARMv8 64-bit Binary: https://nodejs.org/dist/v14.18.0/node-v14.18.0-linux-arm64.tar.xz
Source Code: https://nodejs.org/dist/v14.18.0/node-v14.18.0.tar.gz
Other release files: https://nodejs.org/dist/v14.18.0/
Documentation: https://nodejs.org/docs/v14.18.0/api/

SHASUMS

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

887bdbc61250431cf2fe062f55124833c02a9386fdd234d11aa868049ac1858e  node-v14.18.0-aix-ppc64.tar.gz
6b9b4d60bcb4eba95488380be8c4da4af98fce3f4a01c9a76db881cbb736656d  node-v14.18.0-darwin-x64.tar.gz
967e74229ba12141487b38bc4911125efd01397a35ec149db264b277792be8b1  node-v14.18.0-darwin-x64.tar.xz
a812312e2b82eee186d14cf78f08e1f3b5a397ecfaada3d7e574a070c50586b6  node-v14.18.0-headers.tar.gz
e06bf25f5f7b8cfcf0e6713b19e44f80011527e90de067e38740b6036eccbd5e  node-v14.18.0-headers.tar.xz
6261a87bf25d08e7b39017a1486b04c65be3ea0ea8442c090e1e4ec4d4cc6ebd  node-v14.18.0-linux-arm64.tar.gz
572cb0d673e0d67f141a64cbe27aeceef41d421e9c763966289d9816d7931711  node-v14.18.0-linux-arm64.tar.xz
3e1ef643adf658a7a27335a2f8efadba85ef9e5bdfffe121e18870537782691b  node-v14.18.0-linux-armv7l.tar.gz
7d6cc474230524f32a87f9c5eea24a2f53d7cac59d6d4db28a2e62a1eda10407  node-v14.18.0-linux-armv7l.tar.xz
67941bfa506372f0e82b6c75a88e9af2407e2b51da67665b6ccaef0bfb3fe14e  node-v14.18.0-linux-ppc64le.tar.gz
35fbf2fbab586ae06cb2440c8169bff6573991e81e95628e3d8af777e6c17c7f  node-v14.18.0-linux-ppc64le.tar.xz
976a57b21162cf731028a756da565ad68a39b87314b6d2afb2903d1dcc43b3f0  node-v14.18.0-linux-s390x.tar.gz
d6efddd341d77612186aa847dd4a7ae5905dc303506c227f2e9a25b94b4a7622  node-v14.18.0-linux-s390x.tar.xz
f411b8aee36d6dc6a5435906f42bd4ea59d6f678894cf562beaf115b58a318ee  node-v14.18.0-linux-x64.tar.gz
5c0bc18b19fd09ff80beb16772e69cb033ee4992a4ccd35bd884fd8f02e6d1ec  node-v14.18.0-linux-x64.tar.xz
81c3bcf76ddf5c7c1fbdf587c8fcdbd765c1533019bfe4ed2310eb0eeabf77ab  node-v14.18.0.pkg
2272312d7eb48a28e982af395142d916385b0572380d07c89f9abd9c97810189  node-v14.18.0.tar.gz
6b485158a0ae4e936346b45da6fdd2ee96cecfef82fce86f281e6bfa14d85859  node-v14.18.0.tar.xz
7a6681ec8df968421abec28d6fa957fdddf1f7708e52ac0d069e0108a4baa0e5  node-v14.18.0-win-x64.7z
2883e83ac3b1e1cb9a9bf65554043640849b39e86761e7c7ac50b664f42f20ff  node-v14.18.0-win-x64.zip
aa2da586d71437468f36c7e5f7143a2f5f8589e2e2d47c4303b2e221a498ecc6  node-v14.18.0-win-x86.7z
56973b1a9a7cd800e5dbf3cc14a73a99012f52402df9eaded081014f2dfd209e  node-v14.18.0-win-x86.zip
35aafc1b10f7041b1f361fb042f32a6484482ff8633cb9c5ffca36c10ef97536  node-v14.18.0-x64.msi
bf9a25b6f57e1e00bc0571068537c6f8da81f8d8c12b511090fe3b85ca7343a0  node-v14.18.0-x86.msi
262ec7222031430cb25fee4c3e932a94afe65a3614ecc8b68e47cfac4f76e7b3  win-x64/node.exe
6b6ed13aca8d29bfea176b171ef8630ce5bfcc528dca43d985452b9a9948f4bf  win-x64/node.lib
5b0773c40b17e4a3927b2755dbae2920cfdb070e0e100642a351034085b37e5f  win-x64/node_pdb.7z
a8628bf0e2c7ada4712887293017b7bff69348ddb5bd68fbd2aeac8e738c125f  win-x64/node_pdb.zip
b242b6ff988a6bb79cb0bd231b918e380fb77ad8085b1bcf4ef1f49751720533  win-x86/node.exe
2fa2858c3c462b6e1d94ba57ad0adc0b68dd6fa5693e7b8dc33b43c191c8a15f  win-x86/node.lib
927902b874a97d2c3e8b0ff1b85e16c50ed812bfef1a71d477a9e2f9da4449e1  win-x86/node_pdb.7z
9bad7e0bd46bf1db3ce1c48147018cd7ce7a712872cb9795c591bd8319d75678  win-x86/node_pdb.zip
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEj8yhP+8dDC6RAI4Jdw96mlrhVgAFAmFS/WQACgkQdw96mlrh
VgA1IQ/9Hy1WMliGwlUd/VWxbVSBsj1XeCnZSGjTRJTVQR9BS33ZsUJlD9okgrCi
YGF/512BcZkm2406Jpi7FlZUR+RNjv4JKgylL6WWWtp148H+IbQ7nMl9z6DRGsCU
DNGdDt6FVAiX+zBm44LLAQ/nUNCGp+GzXfsT3LQFGXJ/7ShgBoNjQnoPKKdYks8e
7B3rN6m4/q+bnZfDGxgjk4dQRmTr9L+Yb4zc9NS9QLD0zd/evhyaS+3ZCiWf5mnR
+eBKcmGAh2IkXHiPARe4qCGop0vqmjDw4HRFce40wp7t0RlOeUDxg0rFgwnwUxwj
0bTKhqQH6BGlesAMghvMMGTNZSO+29d0P3EwrbFPySmGnJrJOlD+CK3wYjO2Ub2T
hIXfKiJKPtawGJISCawsNN+D75Crp/bDtLVkK1HbZnJQmVOcWhsjf2L5kFveVdX7
VZQD6XwQn1s6VF1ZZDSnLXncM7yGpPQfrJLJ8DNFOCLzJiZhiQIQd+RnpxCFpRs6
ahUWHtfQ4VeP5TXqnd647EhLIq6+fsMcHcvNcrb1VEL0nxctk9aHff/KXdF8gEWs
965vZCyfBIDaql/1fdcATKh9GwTky5b+Ysq3hOnyx6G3FAiC/dp8tYUrEC2z1jSs
bwobLsvIs9tgHdh5iocffUxg7CcoxcnRPNRTqm8lfDKYk+rDMr4=
=QWM8
-----END PGP SIGNATURE-----

Original article at https://nodejs.org

#node #nodejs 

What is GEEK

Buddha Community

Node.js v14.18.0 LTS has been released!

Py Rouge: A Full Python Implementation Of The ROUGE Metric

Py-rouge

A full Python implementation of the ROUGE metric, producing same results as in the official perl implementation.

Important remarks

  • The original Porter stemmer in NLTK is slightly different than the one use in the official ROUGE perl script as it has been written by end. Therefore, there might be slightly different stems for certain words. For DUC2004 dataset, I have identified these words and this script produces same stems.
  • The official ROUGE perl script use resampling strategy to compute the average with confidence intervals. Therefore, we might have a difference <3e-5 for ROUGE-L as well as ROUGE-W and <4e-5 for ROUGE-N.
  • Finally, ROUGE-1.5.5. has a bug: should have $tmpTextLen += $sLen at line 2101. Here, the last sentence, $limitBytes is taken instead of $limitBytes-$tmpTextLen (as $tmpTextLen is never updated with bytes length limit). It has been fixed in this code. This bug does not have a consequence for the default evaluation -b 665.

In case of doubts, please see all the implemented tests to compare outputs between the official ROUGE-1.5.5 and this script.

Installation

Package is uploaded on PyPI <https://pypi.org/project/py-rouge>_.

You can install it with pip:

pip install py-rouge

or do it manually:

git clone https://github.com/Diego999/py-rouge
cd py-rouge
python setup.py install

Issues/Pull Requests/Feedbacks

Don't hesitate to contact for any feedback or create issues/pull requests (especially if you want to rewrite the stemmer implemented in ROUGE-1.5.5 in python ;)).

Example

import rouge


def prepare_results(m, p, r, f):
    return '\t{}:\t{}: {:5.2f}\t{}: {:5.2f}\t{}: {:5.2f}'.format(m, 'P', 100.0 * p, 'R', 100.0 * r, 'F1', 100.0 * f)


for aggregator in ['Avg', 'Best', 'Individual']:
    print('Evaluation with {}'.format(aggregator))
    apply_avg = aggregator == 'Avg'
    apply_best = aggregator == 'Best'

    evaluator = rouge.Rouge(metrics=['rouge-n', 'rouge-l', 'rouge-w'],
                           max_n=4,
                           limit_length=True,
                           length_limit=100,
                           length_limit_type='words',
                           apply_avg=apply_avg,
                           apply_best=apply_best,
                           alpha=0.5, # Default F1_score
                           weight_factor=1.2,
                           stemming=True)


    hypothesis_1 = "King Norodom Sihanouk has declined requests to chair a summit of Cambodia 's top political leaders , saying the meeting would not bring any progress in deadlocked negotiations to form a government .\nGovernment and opposition parties have asked King Norodom Sihanouk to host a summit meeting after a series of post-election negotiations between the two opposition groups and Hun Sen 's party to form a new government failed .\nHun Sen 's ruling party narrowly won a majority in elections in July , but the opposition _ claiming widespread intimidation and fraud _ has denied Hun Sen the two-thirds vote in parliament required to approve the next government .\n"
    references_1 = ["Prospects were dim for resolution of the political crisis in Cambodia in October 1998.\nPrime Minister Hun Sen insisted that talks take place in Cambodia while opposition leaders Ranariddh and Sam Rainsy, fearing arrest at home, wanted them abroad.\nKing Sihanouk declined to chair talks in either place.\nA U.S. House resolution criticized Hun Sen's regime while the opposition tried to cut off his access to loans.\nBut in November the King announced a coalition government with Hun Sen heading the executive and Ranariddh leading the parliament.\nLeft out, Sam Rainsy sought the King's assurance of Hun Sen's promise of safety and freedom for all politicians.",
                    "Cambodian prime minister Hun Sen rejects demands of 2 opposition parties for talks in Beijing after failing to win a 2/3 majority in recent elections.\nSihanouk refuses to host talks in Beijing.\nOpposition parties ask the Asian Development Bank to stop loans to Hun Sen's government.\nCCP defends Hun Sen to the US Senate.\nFUNCINPEC refuses to share the presidency.\nHun Sen and Ranariddh eventually form a coalition at summit convened by Sihanouk.\nHun Sen remains prime minister, Ranariddh is president of the national assembly, and a new senate will be formed.\nOpposition leader Rainsy left out.\nHe seeks strong assurance of safety should he return to Cambodia.\n",
                    ]

    hypothesis_2 = "China 's government said Thursday that two prominent dissidents arrested this week are suspected of endangering national security _ the clearest sign yet Chinese leaders plan to quash a would-be opposition party .\nOne leader of a suppressed new political party will be tried on Dec. 17 on a charge of colluding with foreign enemies of China '' to incite the subversion of state power , '' according to court documents given to his wife on Monday .\nWith attorneys locked up , harassed or plain scared , two prominent dissidents will defend themselves against charges of subversion Thursday in China 's highest-profile dissident trials in two years .\n"
    references_2 = "Hurricane Mitch, category 5 hurricane, brought widespread death and destruction to Central American.\nEspecially hard hit was Honduras where an estimated 6,076 people lost their lives.\nThe hurricane, which lingered off the coast of Honduras for 3 days before moving off, flooded large areas, destroying crops and property.\nThe U.S. and European Union were joined by Pope John Paul II in a call for money and workers to help the stricken area.\nPresident Clinton sent Tipper Gore, wife of Vice President Gore to the area to deliver much needed supplies to the area, demonstrating U.S. commitment to the recovery of the region.\n"

    all_hypothesis = [hypothesis_1, hypothesis_2]
    all_references = [references_1, references_2]

    scores = evaluator.get_scores(all_hypothesis, all_references)

    for metric, results in sorted(scores.items(), key=lambda x: x[0]):
        if not apply_avg and not apply_best: # value is a type of list as we evaluate each summary vs each reference
            for hypothesis_id, results_per_ref in enumerate(results):
                nb_references = len(results_per_ref['p'])
                for reference_id in range(nb_references):
                    print('\tHypothesis #{} & Reference #{}: '.format(hypothesis_id, reference_id))
                    print('\t' + prepare_results(metric,results_per_ref['p'][reference_id], results_per_ref['r'][reference_id], results_per_ref['f'][reference_id]))
            print()
        else:
            print(prepare_results(metric, results['p'], results['r'], results['f']))
    print()

It produces the following output:

Evaluation with Avg
    rouge-1:    P: 28.62    R: 26.46    F1: 27.49
    rouge-2:    P:  4.21    R:  3.92    F1:  4.06
    rouge-3:    P:  0.80    R:  0.74    F1:  0.77
    rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    rouge-l:    P: 30.52    R: 28.57    F1: 29.51
    rouge-w:    P: 15.85    R:  8.28    F1: 10.87

Evaluation with Best
    rouge-1:    P: 30.44    R: 28.36    F1: 29.37
    rouge-2:    P:  4.74    R:  4.46    F1:  4.59
    rouge-3:    P:  1.06    R:  0.98    F1:  1.02
    rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    rouge-l:    P: 31.54    R: 29.71    F1: 30.60
    rouge-w:    P: 16.42    R:  8.82    F1: 11.47

Evaluation with Individual
    Hypothesis #0 & Reference #0: 
        rouge-1:    P: 38.54    R: 35.58    F1: 37.00
    Hypothesis #0 & Reference #1: 
        rouge-1:    P: 45.83    R: 43.14    F1: 44.44
    Hypothesis #1 & Reference #0: 
        rouge-1:    P: 15.05    R: 13.59    F1: 14.29

    Hypothesis #0 & Reference #0: 
        rouge-2:    P:  7.37    R:  6.80    F1:  7.07
    Hypothesis #0 & Reference #1: 
        rouge-2:    P:  9.47    R:  8.91    F1:  9.18
    Hypothesis #1 & Reference #0: 
        rouge-2:    P:  0.00    R:  0.00    F1:  0.00

    Hypothesis #0 & Reference #0: 
        rouge-3:    P:  2.13    R:  1.96    F1:  2.04
    Hypothesis #0 & Reference #1: 
        rouge-3:    P:  1.06    R:  1.00    F1:  1.03
    Hypothesis #1 & Reference #0: 
        rouge-3:    P:  0.00    R:  0.00    F1:  0.00

    Hypothesis #0 & Reference #0: 
        rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    Hypothesis #0 & Reference #1: 
        rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    Hypothesis #1 & Reference #0: 
        rouge-4:    P:  0.00    R:  0.00    F1:  0.00

    Hypothesis #0 & Reference #0: 
        rouge-l:    P: 42.11    R: 39.39    F1: 40.70
    Hypothesis #0 & Reference #1: 
        rouge-l:    P: 46.19    R: 43.92    F1: 45.03
    Hypothesis #1 & Reference #0: 
        rouge-l:    P: 16.88    R: 15.50    F1: 16.16

    Hypothesis #0 & Reference #0: 
        rouge-w:    P: 22.27    R: 11.49    F1: 15.16
    Hypothesis #0 & Reference #1: 
        rouge-w:    P: 24.56    R: 13.60    F1: 17.51
    Hypothesis #1 & Reference #0: 
        rouge-w:    P:  8.29    R:  4.04    F1:  5.43

Download Details:

Author: Diego999
Source Code: https://github.com/Diego999/py-rouge

License: Apache-2.0 license

#perl #python 

Substrate Parachain Template: A New Cumulus-based Substrate Node

Substrate Cumulus Parachain Template

A new Cumulus-based Substrate node, ready for hacking :cloud:

This project is a fork of the Substrate Node Template modified to include dependencies required for registering this node as a parathread or parachain to an established relay chain.

👉 Learn more about parachains here, and parathreads here.

Build & Run

Follow these steps to prepare a local Substrate development environment :hammer_and_wrench:

Setup of Machine

If necessary, refer to the setup instructions at the Substrate Developer Hub.

Build

Once the development environment is set up, build the Cumulus Parachain Template. This command will build the Wasm Runtime and native code:

cargo build --release

Relay Chain

NOTE: In the following two sections, we document how to manually start a few relay chain nodes, start a parachain node (collator), and register the parachain with the relay chain.

We also have the polkadot-launch CLI tool that automate the following steps and help you easily launch relay chains and parachains. However it is still good to go through the following procedures once to understand the mechanism for running and registering a parachain.

To operate a parathread or parachain, you must connect to a relay chain. Typically you would test on a local Rococo development network, then move to the testnet, and finally launch on the mainnet. Keep in mind you need to configure the specific relay chain you will connect to in your collator chain_spec.rs. In the following examples, we will use rococo-local as the relay network.

Build Relay Chain

Clone and build Polkadot (beware of the version tag we used):

# Get a fresh clone, or `cd` to where you have polkadot already:
git clone -b v0.9.7 --depth 1 https://github.com/paritytech/polkadot.git
cd polkadot
cargo build --release

Generate the Relay Chain Chainspec

First, we create the chain specification file (chainspec). Note the chainspec file must be generated on a single node and then shared among all nodes!

👉 Learn more about chain specification here.

./target/release/polkadot build-spec \
--chain rococo-local \
--raw \
--disable-default-bootnode \
> rococo_local.json

Start Relay Chain

We need n + 1 full validator nodes running on a relay chain to accept n parachain / parathread connections. Here we will start two relay chain nodes so we can have one parachain node connecting in later.

From the Polkadot working directory:

# Start Relay `Alice` node
./target/release/polkadot \
--chain ./rococo_local.json \
-d /tmp/relay/alice \
--validator \
--alice \
--port 50555

Open a new terminal, same directory:

# Start Relay `Bob` node
./target/release/polkadot \
--chain ./rococo_local.json \
-d /tmp/relay/bob \
--validator \
--bob \
--port 50556

Add more nodes as needed, with non-conflicting ports, DB directories, and validator keys (--charlie, --dave, etc.).

Reserve a ParaID

To connect to a relay chain, you must first _reserve a ParaId for your parathread that will become a parachain. To do this, you will need sufficient amount of currency on the network account to reserve the ID.

In this example, we will use Charlie development account where we have funds available. Once you submit this extrinsic successfully, you can start your collators.

The easiest way to reserve your ParaId is via Polkadot Apps UI under the Parachains -> Parathreads tab and use the + ParaID button.

Parachain

Select the Correct Relay Chain

To operate your parachain, you need to specify the correct relay chain you will connect to in your collator chain_spec.rs. Specifically you pass the command for the network you need in the Extensions of your ChainSpec::from_genesis() in the code.

Extensions {
    relay_chain: "rococo-local".into(), // You MUST set this to the correct network!
    para_id: id.into(),
},

You can choose from any pre-set runtime chainspec in the Polkadot repo, by referring to the cli/src/command.rs and node/service/src/chain_spec.rs files or generate your own and use that. See the Cumulus Workshop for how.

In the following examples, we will use the rococo-local relay network we setup in the last section.

Export the Parachain Genesis and Runtime

We first generate the genesis state and genesis wasm needed for the parachain registration.

# Build the parachain node (from it's top level dir)
cd substrate-parachain-template
cargo build --release

# Folder to store resource files needed for parachain registration
mkdir -p resources

# Build the chainspec
./target/release/parachain-collator build-spec \
--disable-default-bootnode > ./resources/template-local-plain.json

# Build the raw chainspec file
./target/release/parachain-collator build-spec \
--chain=./resources/template-local-plain.json \
--raw --disable-default-bootnode > ./resources/template-local-raw.json

# Export genesis state to `./resources`, using 2000 as the ParaId
./target/release/parachain-collator export-genesis-state --parachain-id 2000 > ./resources/para-2000-genesis

# Export the genesis wasm
./target/release/parachain-collator export-genesis-wasm > ./resources/para-2000-wasm

NOTE: we have set the para_ID to be 2000 here. This must be unique for all parathreads/chains on the relay chain you register with. You must reserve this first on the relay chain for the testnet or mainnet.

Start a Parachain Node (Collator)

From the parachain template working directory:

# NOTE: this command assumes the chain spec is in a directory named `polkadot`
# that is at the same level of the template working directory. Change as needed.
#
# It also assumes a ParaId of 2000. Change as needed.
./target/release/parachain-collator \
-d /tmp/parachain/alice \
--collator \
--alice \
--force-authoring \
--ws-port 9945 \
--parachain-id 2000 \
-- \
--execution wasm \
--chain ../polkadot/rococo_local.json

Output:

2021-05-30 16:57:39 Parachain Collator Template
2021-05-30 16:57:39 ✌️  version 3.0.0-acce183-x86_64-linux-gnu
2021-05-30 16:57:39 ❤️  by Anonymous, 2017-2021
2021-05-30 16:57:39 📋 Chain specification: Local Testnet
2021-05-30 16:57:39 🏷 Node name: Alice
2021-05-30 16:57:39 👤 Role: AUTHORITY
2021-05-30 16:57:39 💾 Database: RocksDb at /tmp/parachain/alice/chains/local_testnet/db
2021-05-30 16:57:39 ⛓  Native runtime: template-parachain-1 (template-parachain-0.tx1.au1)
2021-05-30 16:57:41 Parachain id: Id(2000)
2021-05-30 16:57:41 Parachain Account: 5Ec4AhPUwPeyTFyuhGuBbD224mY85LKLMSqSSo33JYWCazU4
2021-05-30 16:57:41 Parachain genesis state: 0x0000000000000000000000000000000000000000000000000000000000000000000a96f42b5cb798190e5f679bb16970905087a9a9fc612fb5ca6b982b85783c0d03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c11131400
2021-05-30 16:57:41 Is collating: yes
2021-05-30 16:57:41 [Parachain] 🔨 Initializing Genesis block/state (state: 0x0a96…3c0d, header-hash: 0xd42b…f271)
2021-05-30 16:57:41 [Parachain] ⏱  Loaded block-time = 12s from block 0xd42bb78354bc21770e3f0930ed45c7377558d2d8e81ca4d457e573128aabf271
2021-05-30 16:57:43 [Relaychain] 🔨 Initializing Genesis block/state (state: 0xace1…1b62, header-hash: 0xfa68…cf58)
2021-05-30 16:57:43 [Relaychain] 👴 Loading GRANDPA authority set from genesis on what appears to be first startup.
2021-05-30 16:57:44 [Relaychain] ⏱  Loaded block-time = 6s from block 0xfa68f5abd2a80394b87c9bd07e0f4eee781b8c696d0a22c8e5ba38ae10e1cf58
2021-05-30 16:57:44 [Relaychain] 👶 Creating empty BABE epoch changes on what appears to be first startup.
2021-05-30 16:57:44 [Relaychain] 🏷 Local node identity is: 12D3KooWBjYK2W4dsBfsrFA9tZCStb5ogPb6STQqi2AK9awXfXyG
2021-05-30 16:57:44 [Relaychain] 📦 Highest known block at #0
2021-05-30 16:57:44 [Relaychain] 〽️ Prometheus server started at 127.0.0.1:9616
2021-05-30 16:57:44 [Relaychain] Listening for new connections on 127.0.0.1:9945.
2021-05-30 16:57:44 [Parachain] Using default protocol ID "sup" because none is configured in the chain specs
2021-05-30 16:57:44 [Parachain] 🏷 Local node identity is: 12D3KooWADBSC58of6ng2M29YTDkmWCGehHoUZhsy9LGkHgYscBw
2021-05-30 16:57:44 [Parachain] 📦 Highest known block at #0
2021-05-30 16:57:44 [Parachain] Unable to listen on 127.0.0.1:9945
2021-05-30 16:57:44 [Parachain] Unable to bind RPC server to 127.0.0.1:9945. Trying random port.
2021-05-30 16:57:44 [Parachain] Listening for new connections on 127.0.0.1:45141.
2021-05-30 16:57:45 [Relaychain] 🔍 Discovered new external address for our node: /ip4/192.168.42.204/tcp/30334/ws/p2p/12D3KooWBjYK2W4dsBfsrFA9tZCStb5ogPb6STQqi2AK9awXfXyG
2021-05-30 16:57:45 [Parachain] 🔍 Discovered new external address for our node: /ip4/192.168.42.204/tcp/30333/p2p/12D3KooWADBSC58of6ng2M29YTDkmWCGehHoUZhsy9LGkHgYscBw
2021-05-30 16:57:48 [Relaychain] ✨ Imported #8 (0xe60b…9b0a)
2021-05-30 16:57:49 [Relaychain] 💤 Idle (2 peers), best: #8 (0xe60b…9b0a), finalized #5 (0x1e6f…567c), ⬇ 4.5kiB/s ⬆ 2.2kiB/s
2021-05-30 16:57:49 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 2.0kiB/s ⬆ 1.7kiB/s
2021-05-30 16:57:54 [Relaychain] ✨ Imported #9 (0x1af9…c9be)
2021-05-30 16:57:54 [Relaychain] ✨ Imported #9 (0x6ed8…fdf6)
2021-05-30 16:57:54 [Relaychain] 💤 Idle (2 peers), best: #9 (0x1af9…c9be), finalized #6 (0x3319…69a2), ⬇ 1.8kiB/s ⬆ 0.5kiB/s
2021-05-30 16:57:54 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0.2kiB/s ⬆ 0.2kiB/s
2021-05-30 16:57:59 [Relaychain] 💤 Idle (2 peers), best: #9 (0x1af9…c9be), finalized #7 (0x5b50…1e5b), ⬇ 0.6kiB/s ⬆ 0.4kiB/s
2021-05-30 16:57:59 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 16:58:00 [Relaychain] ✨ Imported #10 (0xc9c9…1ca3)

You see messages are from both a relaychain node and a parachain node. This is because a relay chain light client is also run next to the parachain collator.

Parachain Registration

Now that you have two relay chain nodes, and a parachain node accompanied with a relay chain light client running, the next step is to register the parachain in the relay chain with the following steps (for detail, refer to the Substrate Cumulus Worship):

  • Goto Polkadot Apps UI, connecting to your relay chain.
  • Execute a sudo extrinsic on the relay chain by going to Developer -> sudo page.
  • Pick paraSudoWrapper -> sudoScheduleParaInitialize(id, genesis) as the extrinsic type, shown below.

Polkadot Apps UI

  • Set the id: ParaId to 2,000 (or whatever ParaId you used above), and set the parachain: Bool option to Yes.
  • For the genesisHead, drag the genesis state file exported above, para-2000-genesis, in.
  • For the validationCode, drag the genesis wasm file exported above, para-2000-wasm, in.

Note: When registering to the public Rococo testnet, ensure you set a unique paraId larger than 1,000. Values below 1,000 are reserved exclusively for system parachains.

Restart the Parachain (Collator)

The collator node may need to be restarted to get it functioning as expected. After a new epoch starts on the relay chain, your parachain will come online. Once this happens, you should see the collator start reporting parachain blocks:

# Notice the relay epoch change! Only then do we start parachain collating!
#
2021-05-30 17:00:04 [Relaychain] 💤 Idle (2 peers), best: #30 (0xfc02…2a2a), finalized #28 (0x10ff…6539), ⬇ 1.0kiB/s ⬆ 0.3kiB/s
2021-05-30 17:00:04 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:06 [Relaychain] 👶 New epoch 3 launching at block 0x68bc…0605 (block slot 270402601 >= start slot 270402601).
2021-05-30 17:00:06 [Relaychain] 👶 Next epoch starts at slot 270402611
2021-05-30 17:00:06 [Relaychain] ✨ Imported #31 (0x68bc…0605)
2021-05-30 17:00:06 [Parachain] Starting collation. relay_parent=0x68bcc93d24a31a2c89800a56c7a2b275fe9ca7bd63f829b64588ae0d99280605 at=0xd42bb78354bc21770e3f0930ed45c7377558d2d8e81ca4d457e573128aabf271
2021-05-30 17:00:06 [Parachain] 🙌 Starting consensus session on top of parent 0xd42bb78354bc21770e3f0930ed45c7377558d2d8e81ca4d457e573128aabf271
2021-05-30 17:00:06 [Parachain] 🎁 Prepared block for proposing at 1 [hash: 0xf6507812bf60bf53af1311f775aac03869be870df6b0406b2969784d0935cb92; parent_hash: 0xd42b…f271; extrinsics (2): [0x1bf5…1d76, 0x7c9b…4e23]]
2021-05-30 17:00:06 [Parachain] 🔖 Pre-sealed block for proposal at 1. Hash now 0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae, previously 0xf6507812bf60bf53af1311f775aac03869be870df6b0406b2969784d0935cb92.
2021-05-30 17:00:06 [Parachain] ✨ Imported #1 (0x80fc…ccae)
2021-05-30 17:00:06 [Parachain] Produced proof-of-validity candidate. block_hash=0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae
2021-05-30 17:00:09 [Relaychain] 💤 Idle (2 peers), best: #31 (0x68bc…0605), finalized #29 (0xa6fa…9e16), ⬇ 1.2kiB/s ⬆ 129.9kiB/s
2021-05-30 17:00:09 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:12 [Relaychain] ✨ Imported #32 (0x5e92…ba30)
2021-05-30 17:00:12 [Relaychain] Moving approval window from session 0..=2 to 0..=3
2021-05-30 17:00:12 [Relaychain] ✨ Imported #32 (0x8144…74eb)
2021-05-30 17:00:14 [Relaychain] 💤 Idle (2 peers), best: #32 (0x5e92…ba30), finalized #29 (0xa6fa…9e16), ⬇ 1.4kiB/s ⬆ 0.2kiB/s
2021-05-30 17:00:14 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:18 [Relaychain] ✨ Imported #33 (0x8c30…9ccd)
2021-05-30 17:00:18 [Parachain] Starting collation. relay_parent=0x8c30ce9e6e9867824eb2aff40148ac1ed64cf464f51c5f2574013b44b20f9ccd at=0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae
2021-05-30 17:00:19 [Relaychain] 💤 Idle (2 peers), best: #33 (0x8c30…9ccd), finalized #30 (0xfc02…2a2a), ⬇ 0.7kiB/s ⬆ 0.4kiB/s
2021-05-30 17:00:19 [Parachain] 💤 Idle (0 peers), best: #1 (0x80fc…ccae), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:22 [Relaychain] 👴 Applying authority set change scheduled at block #31
2021-05-30 17:00:22 [Relaychain] 👴 Applying GRANDPA set change to new set [(Public(88dc3417d5058ec4b4503e0c12ea1a0a89be200fe98922423d4334014fa6b0ee (5FA9nQDV...)), 1), (Public(d17c2d7823ebf260fd138f2d7e27d114c0145d968b5ff5006125f2414fadae69 (5GoNkf6W...)), 1)]
2021-05-30 17:00:22 [Relaychain] 👴 Imported justification for block #31 that triggers command Changing authorities, signaling voter.
2021-05-30 17:00:24 [Relaychain] ✨ Imported #34 (0x211b…febf)
2021-05-30 17:00:24 [Parachain] Starting collation. relay_parent=0x211b3c53bebeff8af05e8f283d59fe171b7f91a5bf9c4669d88943f5a42bfebf at=0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae
2021-05-30 17:00:24 [Parachain] 🙌 Starting consensus session on top of parent 0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae
2021-05-30 17:00:24 [Parachain] 🎁 Prepared block for proposing at 2 [hash: 0x10fcb3180e966729c842d1b0c4d8d2c4028cfa8bef02b909af5ef787e6a6a694; parent_hash: 0x80fc…ccae; extrinsics (2): [0x4a6c…1fc6, 0x6b84…7cea]]
2021-05-30 17:00:24 [Parachain] 🔖 Pre-sealed block for proposal at 2. Hash now 0x5087fd06b1b73d90cfc3ad175df8495b378fffbb02fea212cc9e49a00fd8b5a0, previously 0x10fcb3180e966729c842d1b0c4d8d2c4028cfa8bef02b909af5ef787e6a6a694.
2021-05-30 17:00:24 [Parachain] ✨ Imported #2 (0x5087…b5a0)
2021-05-30 17:00:24 [Parachain] Produced proof-of-validity candidate. block_hash=0x5087fd06b1b73d90cfc3ad175df8495b378fffbb02fea212cc9e49a00fd8b5a0
2021-05-30 17:00:24 [Relaychain] 💤 Idle (2 peers), best: #34 (0x211b…febf), finalized #31 (0x68bc…0605), ⬇ 1.0kiB/s ⬆ 130.1kiB/s
2021-05-30 17:00:24 [Parachain] 💤 Idle (0 peers), best: #1 (0x80fc…ccae), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:29 [Relaychain] 💤 Idle (2 peers), best: #34 (0x211b…febf), finalized #32 (0x5e92…ba30), ⬇ 0.2kiB/s ⬆ 0.1kiB/s
2021-05-30 17:00:29 [Parachain] 💤 Idle (0 peers), best: #1 (0x80fc…ccae), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:30 [Relaychain] ✨ Imported #35 (0xee07…38a0)
2021-05-30 17:00:34 [Relaychain] 💤 Idle (2 peers), best: #35 (0xee07…38a0), finalized #33 (0x8c30…9ccd), ⬇ 0.9kiB/s ⬆ 0.3kiB/s
2021-05-30 17:00:34 [Parachain] 💤 Idle (0 peers), best: #1 (0x80fc…ccae), finalized #1 (0x80fc…ccae), ⬇ 0 ⬆ 0
2021-05-30 17:00:36 [Relaychain] ✨ Imported #36 (0xe8ce…4af6)
2021-05-30 17:00:36 [Parachain] Starting collation. relay_parent=0xe8cec8015c0c7bf508bf3f2f82b1696e9cca078e814b0f6671f0b0d5dfe84af6 at=0x5087fd06b1b73d90cfc3ad175df8495b378fffbb02fea212cc9e49a00fd8b5a0
2021-05-30 17:00:39 [Relaychain] 💤 Idle (2 peers), best: #36 (0xe8ce…4af6), finalized #33 (0x8c30…9ccd), ⬇ 0.6kiB/s ⬆ 0.1kiB/s
2021-05-30 17:00:39 [Parachain] 💤 Idle (0 peers), best: #2 (0x5087…b5a0), finalized #1 (0x80fc…ccae), ⬇ 0 ⬆ 0

Note the delay here! It may take some time for your relay chain to enter a new epoch.

Rococo & Westend Relay Chain Testnets

Is this Cumulus Parachain Template Rococo & Westend testnets compatible? Yes!

  • Rococo is the testnet of Kusama (join the Rococo Faucet to get testing funds).
  • Westend is the testnet of Polkadot (join the Westend Faucet to get testing funds).

See the Cumulus Workshop for the latest instructions to register a parathread/parachain on a relay chain.

NOTE: When running the relay chain and parachain, you must use the same tagged version of Polkadot and Cumulus so the collator would register successfully to the relay chain. You should test locally registering your parachain successfully before attempting to connect to any running relay chain network!

Find chainspec files to connect to live networks here. You want to be sure to use the correct git release tag in these files, as they change from time to time and must match the live network!

These networks are under constant development - so please follow the progress and update of your parachains in lock step with the testnet changes if you wish to connect to the network. Do join the Parachain Technical matrix chat room to ask questions and connect with the parachain building teams.

Learn More

  • More detailed instructions to use Cumulus parachains are found in the Cumulus Workshop.
  • Refer to the upstream Substrate Node Template to learn more about the structure of this project, the capabilities it encapsulates and the way in which those capabilities are implemented.
  • Learn more about how a parachain block is added to a finalized chain here.

Download Details:
Author: aresprotocols
Source Code: https://github.com/aresprotocols/substrate-parachain-template
License: Unlicense License

#rust  #blockchain #substrate #parachain #polkadot 

Plpgsql Check: Extension That Allows to Check Plpgsql Source Code.

plpgsql_check

I founded this project, because I wanted to publish the code I wrote in the last two years, when I tried to write enhanced checking for PostgreSQL upstream. It was not fully successful - integration into upstream requires some larger plpgsql refactoring - probably it will not be done in next years (now is Dec 2013). But written code is fully functional and can be used in production (and it is used in production). So, I created this extension to be available for all plpgsql developers.

If you like it and if you would to join to development of this extension, register yourself to postgresql extension hacking google group.

Features

  • check fields of referenced database objects and types inside embedded SQL
  • using correct types of function parameters
  • unused variables and function argumens, unmodified OUT argumens
  • partially detection of dead code (due RETURN command)
  • detection of missing RETURN command in function
  • try to identify unwanted hidden casts, that can be performance issue like unused indexes
  • possibility to collect relations and functions used by function
  • possibility to check EXECUTE stmt agaist SQL injection vulnerability

I invite any ideas, patches, bugreports.

plpgsql_check is next generation of plpgsql_lint. It allows to check source code by explicit call plpgsql_check_function.

PostgreSQL PostgreSQL 10, 11, 12, 13 and 14 are supported.

The SQL statements inside PL/pgSQL functions are checked by validator for semantic errors. These errors can be found by plpgsql_check_function:

Active mode

postgres=# CREATE EXTENSION plpgsql_check;
LOAD
postgres=# CREATE TABLE t1(a int, b int);
CREATE TABLE

postgres=#
CREATE OR REPLACE FUNCTION public.f1()
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE r record;
BEGIN
  FOR r IN SELECT * FROM t1
  LOOP
    RAISE NOTICE '%', r.c; -- there is bug - table t1 missing "c" column
  END LOOP;
END;
$function$;

CREATE FUNCTION

postgres=# select f1(); -- execution doesn't find a bug due to empty table t1
  f1 
 ────
   
 (1 row)

postgres=# \x
Expanded display is on.
postgres=# select * from plpgsql_check_function_tb('f1()');
─[ RECORD 1 ]───────────────────────────
functionid │ f1
lineno     │ 6
statement  │ RAISE
sqlstate   │ 42703
message    │ record "r" has no field "c"
detail     │ [null]
hint       │ [null]
level      │ error
position   │ 0
query      │ [null]

postgres=# \sf+ f1
    CREATE OR REPLACE FUNCTION public.f1()
     RETURNS void
     LANGUAGE plpgsql
1       AS $function$
2       DECLARE r record;
3       BEGIN
4         FOR r IN SELECT * FROM t1
5         LOOP
6           RAISE NOTICE '%', r.c; -- there is bug - table t1 missing "c" column
7         END LOOP;
8       END;
9       $function$

Function plpgsql_check_function() has three possible formats: text, json or xml

select * from plpgsql_check_function('f1()', fatal_errors := false);
                         plpgsql_check_function                         
------------------------------------------------------------------------
 error:42703:4:SQL statement:column "c" of relation "t1" does not exist
 Query: update t1 set c = 30
 --                   ^
 error:42P01:7:RAISE:missing FROM-clause entry for table "r"
 Query: SELECT r.c
 --            ^
 error:42601:7:RAISE:too few parameters specified for RAISE
(7 rows)

postgres=# select * from plpgsql_check_function('fx()', format:='xml');
                 plpgsql_check_function                     
────────────────────────────────────────────────────────────────
 <Function oid="16400">                                        ↵
   <Issue>                                                     ↵
     <Level>error</level>                                      ↵
     <Sqlstate>42P01</Sqlstate>                                ↵
     <Message>relation "foo111" does not exist</Message>       ↵
     <Stmt lineno="3">RETURN</Stmt>                            ↵
     <Query position="23">SELECT (select a from foo111)</Query>↵
   </Issue>                                                    ↵
  </Function>
 (1 row)

Arguments

You can set level of warnings via function's parameters:

Mandatory arguments

  • function name or function signature - these functions requires function specification. Any function in PostgreSQL can be specified by Oid or by name or by signature. When you know oid or complete function's signature, you can use a regprocedure type parameter like 'fx()'::regprocedure or 16799::regprocedure. Possible alternative is using a name only, when function's name is unique - like 'fx'. When the name is not unique or the function doesn't exists it raises a error.

Optional arguments

relid DEFAULT 0 - oid of relation assigned with trigger function. It is necessary for check of any trigger function.

fatal_errors boolean DEFAULT true - stop on first error

other_warnings boolean DEFAULT true - show warnings like different attributes number in assignmenet on left and right side, variable overlaps function's parameter, unused variables, unwanted casting, ..

extra_warnings boolean DEFAULT true - show warnings like missing RETURN, shadowed variables, dead code, never read (unused) function's parameter, unmodified variables, modified auto variables, ..

performance_warnings boolean DEFAULT false - performance related warnings like declared type with type modificator, casting, implicit casts in where clause (can be reason why index is not used), ..

security_warnings boolean DEFAULT false - security related checks like SQL injection vulnerability detection

anyelementtype regtype DEFAULT 'int' - a real type used instead anyelement type

anyenumtype regtype DEFAULT '-' - a real type used instead anyenum type

anyrangetype regtype DEFAULT 'int4range' - a real type used instead anyrange type

anycompatibletype DEFAULT 'int' - a real type used instead anycompatible type

anycompatiblerangetype DEFAULT 'int4range' - a real type used instead anycompatible range type

without_warnings DEFAULT false - disable all warnings

all_warnings DEFAULT false - enable all warnings

newtable DEFAULT NULL, oldtable DEFAULT NULL - the names of NEW or OLD transitive tables. These parameters are required when transitive tables are used.

Triggers

When you want to check any trigger, you have to enter a relation that will be used together with trigger function

CREATE TABLE bar(a int, b int);

postgres=# \sf+ foo_trg
    CREATE OR REPLACE FUNCTION public.foo_trg()
         RETURNS trigger
         LANGUAGE plpgsql
1       AS $function$
2       BEGIN
3         NEW.c := NEW.a + NEW.b;
4         RETURN NEW;
5       END;
6       $function$

Missing relation specification

postgres=# select * from plpgsql_check_function('foo_trg()');
ERROR:  missing trigger relation
HINT:  Trigger relation oid must be valid

Correct trigger checking (with specified relation)

postgres=# select * from plpgsql_check_function('foo_trg()', 'bar');
                 plpgsql_check_function                 
--------------------------------------------------------
 error:42703:3:assignment:record "new" has no field "c"
(1 row)

For triggers with transitive tables you can set a oldtable or newtable parameters:

create or replace function footab_trig_func()
returns trigger as $$
declare x int;
begin
  if false then
    -- should be ok;
    select count(*) from newtab into x; 

    -- should fail;
    select count(*) from newtab where d = 10 into x;
  end if;
  return null;
end;
$$ language plpgsql;

select * from plpgsql_check_function('footab_trig_func','footab', newtable := 'newtab');

Mass check

You can use the plpgsql_check_function for mass check functions and mass check triggers. Please, test following queries:

-- check all nontrigger plpgsql functions
SELECT p.oid, p.proname, plpgsql_check_function(p.oid)
   FROM pg_catalog.pg_namespace n
   JOIN pg_catalog.pg_proc p ON pronamespace = n.oid
   JOIN pg_catalog.pg_language l ON p.prolang = l.oid
  WHERE l.lanname = 'plpgsql' AND p.prorettype <> 2279;

or

SELECT p.proname, tgrelid::regclass, cf.*
   FROM pg_proc p
        JOIN pg_trigger t ON t.tgfoid = p.oid 
        JOIN pg_language l ON p.prolang = l.oid
        JOIN pg_namespace n ON p.pronamespace = n.oid,
        LATERAL plpgsql_check_function(p.oid, t.tgrelid) cf
  WHERE n.nspname = 'public' and l.lanname = 'plpgsql'

or

-- check all plpgsql functions (functions or trigger functions with defined triggers)
SELECT
    (pcf).functionid::regprocedure, (pcf).lineno, (pcf).statement,
    (pcf).sqlstate, (pcf).message, (pcf).detail, (pcf).hint, (pcf).level,
    (pcf)."position", (pcf).query, (pcf).context
FROM
(
    SELECT
        plpgsql_check_function_tb(pg_proc.oid, COALESCE(pg_trigger.tgrelid, 0)) AS pcf
    FROM pg_proc
    LEFT JOIN pg_trigger
        ON (pg_trigger.tgfoid = pg_proc.oid)
    WHERE
        prolang = (SELECT lang.oid FROM pg_language lang WHERE lang.lanname = 'plpgsql') AND
        pronamespace <> (SELECT nsp.oid FROM pg_namespace nsp WHERE nsp.nspname = 'pg_catalog') AND
        -- ignore unused triggers
        (pg_proc.prorettype <> (SELECT typ.oid FROM pg_type typ WHERE typ.typname = 'trigger') OR
         pg_trigger.tgfoid IS NOT NULL)
    OFFSET 0
) ss
ORDER BY (pcf).functionid::regprocedure::text, (pcf).lineno

Passive mode

Functions should be checked on start - plpgsql_check module must be loaded.

Configuration

plpgsql_check.mode = [ disabled | by_function | fresh_start | every_start ]
plpgsql_check.fatal_errors = [ yes | no ]

plpgsql_check.show_nonperformance_warnings = false
plpgsql_check.show_performance_warnings = false

Default mode is by_function, that means that the enhanced check is done only in active mode - by plpgsql_check_function. fresh_start means cold start.

You can enable passive mode by

load 'plpgsql'; -- 1.1 and higher doesn't need it
load 'plpgsql_check';
set plpgsql_check.mode = 'every_start';

SELECT fx(10); -- run functions - function is checked before runtime starts it

Limits

plpgsql_check should find almost all errors on really static code. When developer use some PLpgSQL's dynamic features like dynamic SQL or record data type, then false positives are possible. These should be rare - in well written code - and then the affected function should be redesigned or plpgsql_check should be disabled for this function.

CREATE OR REPLACE FUNCTION f1()
RETURNS void AS $$
DECLARE r record;
BEGIN
  FOR r IN EXECUTE 'SELECT * FROM t1'
  LOOP
    RAISE NOTICE '%', r.c;
  END LOOP;
END;
$$ LANGUAGE plpgsql SET plpgsql.enable_check TO false;

A usage of plpgsql_check adds a small overhead (in enabled passive mode) and you should use it only in develop or preprod environments.

Dynamic SQL

This module doesn't check queries that are assembled in runtime. It is not possible to identify results of dynamic queries - so plpgsql_check cannot to set correct type to record variables and cannot to check a dependent SQLs and expressions.

When type of record's variable is not know, you can assign it explicitly with pragma type:

DECLARE r record;
BEGIN
  EXECUTE format('SELECT * FROM %I', _tablename) INTO r;
  PERFORM plpgsql_check_pragma('type: r (id int, processed bool)');
  IF NOT r.processed THEN
    ...

Attention: The SQL injection check can detect only some SQL injection vulnerabilities. This tool cannot be used for security audit! Some issues should not be detected. This check can raise false alarms too - probably when variable is sanitized by other command or when value is of some compose type. 

Refcursors

plpgsql_check should not to detect structure of referenced cursors. A reference on cursor in PLpgSQL is implemented as name of global cursor. In check time, the name is not known (not in all possibilities), and global cursor doesn't exist. It is significant break for any static analyse. PLpgSQL cannot to set correct type for record variables and cannot to check a dependent SQLs and expressions. A solution is same like dynamic SQL. Don't use record variable as target when you use refcursor type or disable plpgsql_check for these functions.

CREATE OR REPLACE FUNCTION foo(refcur_var refcursor)
RETURNS void AS $$
DECLARE
  rec_var record;
BEGIN
  FETCH refcur_var INTO rec_var; -- this is STOP for plpgsql_check
  RAISE NOTICE '%', rec_var;     -- record rec_var is not assigned yet error

In this case a record type should not be used (use known rowtype instead):

CREATE OR REPLACE FUNCTION foo(refcur_var refcursor)
RETURNS void AS $$
DECLARE
  rec_var some_rowtype;
BEGIN
  FETCH refcur_var INTO rec_var;
  RAISE NOTICE '%', rec_var;

Temporary tables

plpgsql_check cannot verify queries over temporary tables that are created in plpgsql's function runtime. For this use case it is necessary to create a fake temp table or disable plpgsql_check for this function.

In reality temp tables are stored in own (per user) schema with higher priority than persistent tables. So you can do (with following trick safetly):

CREATE OR REPLACE FUNCTION public.disable_dml()
RETURNS trigger
LANGUAGE plpgsql AS $function$
BEGIN
  RAISE EXCEPTION SQLSTATE '42P01'
     USING message = format('this instance of %I table doesn''t allow any DML operation', TG_TABLE_NAME),
           hint = format('you should to run "CREATE TEMP TABLE %1$I(LIKE %1$I INCLUDING ALL);" statement',
                         TG_TABLE_NAME);
  RETURN NULL;
END;
$function$;

CREATE TABLE foo(a int, b int); -- doesn't hold data ever
CREATE TRIGGER foo_disable_dml
   BEFORE INSERT OR UPDATE OR DELETE ON foo
   EXECUTE PROCEDURE disable_dml();

postgres=# INSERT INTO  foo VALUES(10,20);
ERROR:  this instance of foo table doesn't allow any DML operation
HINT:  you should to run "CREATE TEMP TABLE foo(LIKE foo INCLUDING ALL);" statement
postgres=# 

CREATE TABLE
postgres=# INSERT INTO  foo VALUES(10,20);
INSERT 0 1

This trick emulates GLOBAL TEMP tables partially and it allows a statical validation. Other possibility is using a [template foreign data wrapper] (https://github.com/okbob/template_fdw)

You can use pragma table and create ephemeral table:

BEGIN
   CREATE TEMP TABLE xxx(a int);
   PERFORM plpgsql_check_pragma('table: xxx(a int)');
   INSERT INTO xxx VALUES(10);

Dependency list

A function plpgsql_show_dependency_tb can show all functions, operators and relations used inside processed function:

postgres=# select * from plpgsql_show_dependency_tb('testfunc(int,float)');
┌──────────┬───────┬────────┬─────────┬────────────────────────────┐
│   type   │  oid  │ schema │  name   │           params           │
╞══════════╪═══════╪════════╪═════════╪════════════════════════════╡
│ FUNCTION │ 36008 │ public │ myfunc1 │ (integer,double precision) │
│ FUNCTION │ 35999 │ public │ myfunc2 │ (integer,double precision) │
│ OPERATOR │ 36007 │ public │ **      │ (integer,integer)          │
│ RELATION │ 36005 │ public │ myview  │                            │
│ RELATION │ 36002 │ public │ mytable │                            │
└──────────┴───────┴────────┴─────────┴────────────────────────────┘
(4 rows)

Profiler

The plpgsql_check contains simple profiler of plpgsql functions and procedures. It can work with/without a access to shared memory. It depends on shared_preload_libraries config. When plpgsql_check was initialized by shared_preload_libraries, then it can allocate shared memory, and function's profiles are stored there. When plpgsql_check cannot to allocate shared momory, the profile is stored in session memory.

Due dependencies, shared_preload_libraries should to contains plpgsql first

postgres=# show shared_preload_libraries ;
┌──────────────────────────┐
│ shared_preload_libraries │
╞══════════════════════════╡
│ plpgsql,plpgsql_check    │
└──────────────────────────┘
(1 row)

The profiler is active when GUC plpgsql_check.profiler is on. The profiler doesn't require shared memory, but if there are not shared memory, then the profile is limmitted just to active session.

When plpgsql_check is initialized by shared_preload_libraries, another GUC is available to configure the amount of shared memory used by the profiler: plpgsql_check.profiler_max_shared_chunks. This defines the maximum number of statements chunk that can be stored in shared memory. For each plpgsql function (or procedure), the whole content is split into chunks of 30 statements. If needed, multiple chunks can be used to store the whole content of a single function. A single chunk is 1704 bytes. The default value for this GUC is 15000, which should be enough for big projects containing hundred of thousands of statements in plpgsql, and will consume about 24MB of memory. If your project doesn't require that much number of chunks, you can set this parameter to a smaller number in order to decrease the memory usage. The minimum value is 50 (which should consume about 83kB of memory), and the maximum value is 100000 (which should consume about 163MB of memory). Changing this parameter requires a PostgreSQL restart.

The profiler will also retrieve the query identifier for each instruction that contains an expression or optimizable statement. Note that this requires pg_stat_statements, or another similar third-party extension), to be installed. There are some limitations to the query identifier retrieval:

  • if a plpgsql expression contains underlying statements, only the top level query identifier will be retrieved
  • the profiler doesn't compute query identifier by itself but relies on external extension, such as pg_stat_statements, for that. It means that depending on the external extension behavior, you may not be able to see a query identifier for some statements. That's for instance the case with DDL statements, as pg_stat_statements doesn't expose the query identifier for such queries.
  • a query identifier is retrieved only for instructions containing expressions. This means that plpgsql_profiler_function_tb() function can report less query identifier than instructions on a single line.

Attention: A update of shared profiles can decrease performance on servers under higher load.

The profile can be displayed by function plpgsql_profiler_function_tb:

postgres=# select lineno, avg_time, source from plpgsql_profiler_function_tb('fx(int)');
┌────────┬──────────┬───────────────────────────────────────────────────────────────────┐
│ lineno │ avg_time │                              source                               │
╞════════╪══════════╪═══════════════════════════════════════════════════════════════════╡
│      1 │          │                                                                   │
│      2 │          │ declare result int = 0;                                           │
│      3 │    0.075 │ begin                                                             │
│      4 │    0.202 │   for i in 1..$1 loop                                             │
│      5 │    0.005 │     select result + i into result; select result + i into result; │
│      6 │          │   end loop;                                                       │
│      7 │        0 │   return result;                                                  │
│      8 │          │ end;                                                              │
└────────┴──────────┴───────────────────────────────────────────────────────────────────┘
(9 rows)

The profile per statements (not per line) can be displayed by function plpgsql_profiler_function_statements_tb:

        CREATE OR REPLACE FUNCTION public.fx1(a integer)
         RETURNS integer
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         if a > 10 then
4           raise notice 'ahoj';
5           return -1;
6         else
7           raise notice 'nazdar';
8           return 1;
9         end if;
10      end;
11      $function$

postgres=# select stmtid, parent_stmtid, parent_note, lineno, exec_stmts, stmtname
             from plpgsql_profiler_function_statements_tb('fx1');
┌────────┬───────────────┬─────────────┬────────┬────────────┬─────────────────┐
│ stmtid │ parent_stmtid │ parent_note │ lineno │ exec_stmts │    stmtname     │
╞════════╪═══════════════╪═════════════╪════════╪════════════╪═════════════════╡
│      0 │             ∅ │ ∅           │      2 │          0 │ statement block │
│      1 │             0 │ body        │      3 │          0 │ IF              │
│      2 │             1 │ then body   │      4 │          0 │ RAISE           │
│      3 │             1 │ then body   │      5 │          0 │ RETURN          │
│      4 │             1 │ else body   │      7 │          0 │ RAISE           │
│      5 │             1 │ else body   │      8 │          0 │ RETURN          │
└────────┴───────────────┴─────────────┴────────┴────────────┴─────────────────┘
(6 rows)

All stored profiles can be displayed by calling function plpgsql_profiler_functions_all:

postgres=# select * from plpgsql_profiler_functions_all();
┌───────────────────────┬────────────┬────────────┬──────────┬─────────────┬──────────┬──────────┐
│        funcoid        │ exec_count │ total_time │ avg_time │ stddev_time │ min_time │ max_time │
╞═══════════════════════╪════════════╪════════════╪══════════╪═════════════╪══════════╪══════════╡
│ fxx(double precision) │          1 │       0.01 │     0.01 │        0.00 │     0.01 │     0.01 │
└───────────────────────┴────────────┴────────────┴──────────┴─────────────┴──────────┴──────────┘
(1 row)

There are two functions for cleaning stored profiles: plpgsql_profiler_reset_all() and plpgsql_profiler_reset(regprocedure).

Coverage metrics

plpgsql_check provides two functions:

  • plpgsql_coverage_statements(name)
  • plpgsql_coverage_branches(name)

Note

There is another very good PLpgSQL profiler - https://bitbucket.org/openscg/plprofiler

My extension is designed to be simple for use and practical. Nothing more or less.

plprofiler is more complex. It build call graphs and from this graph it can creates flame graph of execution times.

Both extensions can be used together with buildin PostgreSQL's feature - tracking functions.

set track_functions to 'pl';
...
select * from pg_stat_user_functions;

Tracer

plpgsql_check provides a tracing possibility - in this mode you can see notices on start or end functions (terse and default verbosity) and start or end statements (verbose verbosity). For default and verbose verbosity the content of function arguments is displayed. The content of related variables are displayed when verbosity is verbose.

postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #0 ->> start of inline_code_block (Oid=0)
NOTICE:  #2   ->> start of function fx(integer,integer,date,text) (Oid=16405)
NOTICE:  #2        call by inline_code_block line 1 at PERFORM
NOTICE:  #2       "a" => '10', "b" => null, "c" => '2020-08-03', "d" => 'stěhule'
NOTICE:  #4     ->> start of function fx(integer) (Oid=16404)
NOTICE:  #4          call by fx(integer,integer,date,text) line 1 at PERFORM
NOTICE:  #4         "a" => '10'
NOTICE:  #4     <<- end of function fx (elapsed time=0.098 ms)
NOTICE:  #2   <<- end of function fx (elapsed time=0.399 ms)
NOTICE:  #0 <<- end of block (elapsed time=0.754 ms)

The number after # is a execution frame counter (this number is related to deep of error context stack). It allows to pair start end and of function.

Tracing is enabled by setting plpgsql_check.tracer to on. Attention - enabling this behaviour has significant negative impact on performance (unlike the profiler). You can set a level for output used by tracer plpgsql_check.tracer_errlevel (default is notice). The output content is limited by length specified by plpgsql_check.tracer_variable_max_length configuration variable.

In terse verbose mode the output is reduced:

postgres=# set plpgsql_check.tracer_verbosity TO terse;
SET
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #0 start of inline code block (oid=0)
NOTICE:  #2 start of fx (oid=16405)
NOTICE:  #4 start of fx (oid=16404)
NOTICE:  #4 end of fx
NOTICE:  #2 end of fx
NOTICE:  #0 end of inline code block

In verbose mode the output is extended about statement details:

postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #0            ->> start of block inline_code_block (oid=0)
NOTICE:  #0.1       1  --> start of PERFORM
NOTICE:  #2              ->> start of function fx(integer,integer,date,text) (oid=16405)
NOTICE:  #2                   call by inline_code_block line 1 at PERFORM
NOTICE:  #2                  "a" => '10', "b" => null, "c" => '2020-08-04', "d" => 'stěhule'
NOTICE:  #2.1       1    --> start of PERFORM
NOTICE:  #2.1                "a" => '10'
NOTICE:  #4                ->> start of function fx(integer) (oid=16404)
NOTICE:  #4                     call by fx(integer,integer,date,text) line 1 at PERFORM
NOTICE:  #4                    "a" => '10'
NOTICE:  #4.1       6      --> start of assignment
NOTICE:  #4.1                  "a" => '10', "b" => '20'
NOTICE:  #4.1              <-- end of assignment (elapsed time=0.076 ms)
NOTICE:  #4.1                  "res" => '130'
NOTICE:  #4.2       7      --> start of RETURN
NOTICE:  #4.2                  "res" => '130'
NOTICE:  #4.2              <-- end of RETURN (elapsed time=0.054 ms)
NOTICE:  #4                <<- end of function fx (elapsed time=0.373 ms)
NOTICE:  #2.1            <-- end of PERFORM (elapsed time=0.589 ms)
NOTICE:  #2              <<- end of function fx (elapsed time=0.727 ms)
NOTICE:  #0.1          <-- end of PERFORM (elapsed time=1.147 ms)
NOTICE:  #0            <<- end of block (elapsed time=1.286 ms)

Special feature of tracer is tracing of ASSERT statement when plpgsql_check.trace_assert is on. When plpgsql_check.trace_assert_verbosity is DEFAULT, then all function's or procedure's variables are displayed when assert expression is false. When this configuration is VERBOSE then all variables from all plpgsql frames are displayed. This behaviour is independent on plpgsql.check_asserts value. It can be used, although the assertions are disabled in plpgsql runtime.

postgres=# set plpgsql_check.tracer to off;
postgres=# set plpgsql_check.trace_assert_verbosity TO verbose;

postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #4 PLpgSQL assert expression (false) on line 12 of fx(integer) is false
NOTICE:   "a" => '10', "res" => null, "b" => '20'
NOTICE:  #2 PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
NOTICE:   "a" => '10', "b" => null, "c" => '2020-08-05', "d" => 'stěhule'
NOTICE:  #0 PL/pgSQL function inline_code_block line 1 at PERFORM
ERROR:  assertion failed
CONTEXT:  PL/pgSQL function fx(integer) line 12 at ASSERT
SQL statement "SELECT fx(a)"
PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
SQL statement "SELECT fx(10,null, 'now', e'stěhule')"
PL/pgSQL function inline_code_block line 1 at PERFORM

postgres=# set plpgsql.check_asserts to off;
SET
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #4 PLpgSQL assert expression (false) on line 12 of fx(integer) is false
NOTICE:   "a" => '10', "res" => null, "b" => '20'
NOTICE:  #2 PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
NOTICE:   "a" => '10', "b" => null, "c" => '2020-08-05', "d" => 'stěhule'
NOTICE:  #0 PL/pgSQL function inline_code_block line 1 at PERFORM
DO

Attention - SECURITY

Tracer prints content of variables or function arguments. For security definer function, this content can hold security sensitive data. This is reason why tracer is disabled by default and should be enabled only with super user rights plpgsql_check.enable_tracer.

Pragma

You can configure plpgsql_check behave inside checked function with "pragma" function. This is a analogy of PL/SQL or ADA language of PRAGMA feature. PLpgSQL doesn't support PRAGMA, but plpgsql_check detects function named plpgsql_check_pragma and get options from parameters of this function. These plpgsql_check options are valid to end of group of statements.

CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
BEGIN
  ...
  -- for following statements disable check
  PERFORM plpgsql_check_pragma('disable:check');
  ...
  -- enable check again
  PERFORM plpgsql_check_pragma('enable:check');
  ...
END;
$$ LANGUAGE plpgsql;

The function plpgsql_check_pragma is immutable function that returns one. It is defined by plpgsql_check extension. You can declare alternative plpgsql_check_pragma function like:

CREATE OR REPLACE FUNCTION plpgsql_check_pragma(VARIADIC args[])
RETURNS int AS $$
SELECT 1
$$ LANGUAGE sql IMMUTABLE;

Using pragma function in declaration part of top block sets options on function level too.

CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
DECLARE
  aux int := plpgsql_check_pragma('disable:extra_warnings');
  ...

Shorter syntax for pragma is supported too:

CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
DECLARE r record;
BEGIN
  PERFORM 'PRAGMA:TYPE:r (a int, b int)';
  PERFORM 'PRAGMA:TABLE: x (like pg_class)';
  ...

Supported pragmas

echo:str - print string (for testing)

status:check,status:tracer, status:other_warnings, status:performance_warnings, status:extra_warnings,status:security_warnings

enable:check,enable:tracer, enable:other_warnings, enable:performance_warnings, enable:extra_warnings,enable:security_warnings

disable:check,disable:tracer, disable:other_warnings, disable:performance_warnings, disable:extra_warnings,disable:security_warnings

type:varname typename or type:varname (fieldname type, ...) - set type to variable of record type

table: name (column_name type, ...) or table: name (like tablename) - create ephereal table

Pragmas enable:tracer and disable:tracerare active for Postgres 12 and higher

Compilation

You need a development environment for PostgreSQL extensions:

make clean
make install

result:

[pavel@localhost plpgsql_check]$ make USE_PGXS=1 clean
rm -f plpgsql_check.so   libplpgsql_check.a  libplpgsql_check.pc
rm -f plpgsql_check.o
rm -rf results/ regression.diffs regression.out tmp_check/ log/
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 all
clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -I. -I./ -I/usr/local/pgsql/include/server -I/usr/local/pgsql/include/internal -D_GNU_SOURCE   -c -o plpgsql_check.o plpgsql_check.c
clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -shared -o plpgsql_check.so plpgsql_check.o -L/usr/local/pgsql/lib -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags  
[pavel@localhost plpgsql_check]$ su root
Password: *******
[root@localhost plpgsql_check]# make USE_PGXS=1 install
/usr/bin/mkdir -p '/usr/local/pgsql/lib'
/usr/bin/mkdir -p '/usr/local/pgsql/share/extension'
/usr/bin/mkdir -p '/usr/local/pgsql/share/extension'
/usr/bin/install -c -m 755  plpgsql_check.so '/usr/local/pgsql/lib/plpgsql_check.so'
/usr/bin/install -c -m 644 plpgsql_check.control '/usr/local/pgsql/share/extension/'
/usr/bin/install -c -m 644 plpgsql_check--0.9.sql '/usr/local/pgsql/share/extension/'
[root@localhost plpgsql_check]# exit
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 installcheck
/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --psqldir='/usr/local/pgsql/bin'    --dbname=pl_regression --load-language=plpgsql --dbname=contrib_regression plpgsql_check_passive plpgsql_check_active plpgsql_check_active-9.5
(using postmaster on Unix socket, default port)
============== dropping database "contrib_regression" ==============
DROP DATABASE
============== creating database "contrib_regression" ==============
CREATE DATABASE
ALTER DATABASE
============== installing plpgsql                     ==============
CREATE LANGUAGE
============== running regression test queries        ==============
test plpgsql_check_passive    ... ok
test plpgsql_check_active     ... ok
test plpgsql_check_active-9.5 ... ok

=====================
 All 3 tests passed. 
=====================

Compilation on Ubuntu

Sometimes successful compilation can require libicu-dev package (PostgreSQL 10 and higher - when pg was compiled with ICU support)

sudo apt install libicu-dev

Compilation plpgsql_check on Windows

You can check precompiled dll libraries http://okbob.blogspot.cz/2015/02/plpgsqlcheck-is-available-for-microsoft.html

or compile by self:

  1. Download and install PostgreSQL for Win32 from http://www.enterprisedb.com
  2. Download and install Microsoft Visual C++ Express
  3. Lern tutorial http://blog.2ndquadrant.com/compiling-postgresql-extensions-visual-studio-windows
  4. Build plpgsql_check.dll
  5. Install plugin
  6. copy plpgsql_check.dll to PostgreSQL\14\lib
  7. copy plpgsql_check.control and plpgsql_check--2.1.sql to PostgreSQL\14\share\extension

Checked on

  • gcc on Linux (against all supported PostgreSQL)
  • clang 3.4 on Linux (against PostgreSQL 10)
  • for success regress tests the PostgreSQL 10 or higher is required

Compilation against PostgreSQL 10 requires libICU!

Licence

Copyright (c) Pavel Stehule (pavel.stehule@gmail.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Note

If you like it, send a postcard to address

Pavel Stehule
Skalice 12
256 01 Benesov u Prahy
Czech Republic

I invite any questions, comments, bug reports, patches on mail address pavel.stehule@gmail.com


Author: okbob
Source Code: https://github.com/okbob/plpgsql_check
License: View license

#postgresql 

Treebender: A Symbolic Natural Language Parsing Library for Rust

Treebender

A symbolic natural language parsing library for Rust, inspired by HDPSG.

What is this?

This is a library for parsing natural or constructed languages into syntax trees and feature structures. There's no machine learning or probabilistic models, everything is hand-crafted and deterministic.

You can find out more about the motivations of this project in this blog post.

But what are you using it for?

I'm using this to parse a constructed language for my upcoming xenolinguistics game, Themengi.

Motivation

Using a simple 80-line grammar, introduced in the tutorial below, we can parse a simple subset of English, checking reflexive pronoun binding, case, and number agreement.

$ cargo run --bin cli examples/reflexives.fgr
> she likes himself
Parsed 0 trees

> her likes herself
Parsed 0 trees

> she like herself
Parsed 0 trees

> she likes herself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: she))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: herself)))
[
  child-2: [
    case: acc
    pron: ref
    needs_pron: #0 she
    num: sg
    child-0: [ word: herself ]
  ]
  child-1: [
    tense: nonpast
    child-0: [ word: likes ]
    num: #1 sg
  ]
  child-0: [
    child-0: [ word: she ]
    case: nom
    pron: #0
    num: #1
  ]
]

Low resource language? Low problem! No need to train on gigabytes of text, just write a grammar using your brain. Let's hypothesize that in American Sign Language, topicalized nouns (expressed with raised eyebrows) must appear first in the sentence. We can write a small grammar (18 lines), and plug in some sentences:

$ cargo run --bin cli examples/asl-wordorder.fgr -n
> boy sit
Parsed 1 tree
(0..2: S
  (0..1: NP ((0..1: N (0..1: boy))))
  (1..2: IV (1..2: sit)))

> boy throw ball
Parsed 1 tree
(0..3: S
  (0..1: NP ((0..1: N (0..1: boy))))
  (1..2: TV (1..2: throw))
  (2..3: NP ((2..3: N (2..3: ball)))))

> ball nm-raised-eyebrows boy throw
Parsed 1 tree
(0..4: S
  (0..2: NP
    (0..1: N (0..1: ball))
    (1..2: Topic (1..2: nm-raised-eyebrows)))
  (2..3: NP ((2..3: N (2..3: boy))))
  (3..4: TV (3..4: throw)))

> boy throw ball nm-raised-eyebrows
Parsed 0 trees

Tutorial

As an example, let's say we want to build a parser for English reflexive pronouns (himself, herself, themselves, themself, itself). We'll also support number ("He likes X" v.s. "They like X") and simple embedded clauses ("He said that they like X").

Grammar files are written in a custom language, similar to BNF, called Feature GRammar (.fgr). There's a VSCode syntax highlighting extension for these files available as fgr-syntax.

We'll start by defining our lexicon. The lexicon is the set of terminal symbols (symbols in the actual input) that the grammar will match. Terminal symbols must start with a lowercase letter, and non-terminal symbols must start with an uppercase letter.

// pronouns
N -> he
N -> him
N -> himself
N -> she
N -> her
N -> herself
N -> they
N -> them
N -> themselves
N -> themself

// names, lowercase as they are terminals
N -> mary
N -> sue
N -> takeshi
N -> robert

// complementizer
Comp -> that

// verbs -- intransitive, transitive, and clausal
IV -> falls
IV -> fall
IV -> fell

TV -> likes
TV -> like
TV -> liked

CV -> says
CV -> say
CV -> said

Next, we can add our sentence rules (they must be added at the top, as the first rule in the file is assumed to be the top-level rule):

// sentence rules
S -> N IV
S -> N TV N
S -> N CV Comp S

// ... previous lexicon ...

Assuming this file is saved as examples/no-features.fgr (which it is :wink:), we can test this file with the built-in CLI:

$ cargo run --bin cli examples/no-features.fgr
> he falls
Parsed 1 tree
(0..2: S
  (0..1: N (0..1: he))
  (1..2: IV (1..2: falls)))
[
  child-1: [ child-0: [ word: falls ] ]
  child-0: [ child-0: [ word: he ] ]
]

> he falls her
Parsed 0 trees

> he likes her
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: he))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: her)))
[
  child-2: [ child-0: [ word: her ] ]
  child-1: [ child-0: [ word: likes ] ]
  child-0: [ child-0: [ word: he ] ]
]

> he likes
Parsed 0 trees

> he said that he likes her
Parsed 1 tree
(0..6: S
  (0..1: N (0..1: he))
  (1..2: CV (1..2: said))
  (2..3: Comp (2..3: that))
  (3..6: S
    (3..4: N (3..4: he))
    (4..5: TV (4..5: likes))
    (5..6: N (5..6: her))))
[
  child-0: [ child-0: [ word: he ] ]
  child-2: [ child-0: [ word: that ] ]
  child-1: [ child-0: [ word: said ] ]
  child-3: [
    child-2: [ child-0: [ word: her ] ]
    child-1: [ child-0: [ word: likes ] ]
    child-0: [ child-0: [ word: he ] ]
  ]
]

> he said that he
Parsed 0 trees

This grammar already parses some correct sentences, and blocks some trivially incorrect ones. However, it doesn't care about number, case, or reflexives right now:

> she likes himself  // unbound reflexive pronoun
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: she))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: himself)))
[
  child-0: [ child-0: [ word: she ] ]
  child-2: [ child-0: [ word: himself ] ]
  child-1: [ child-0: [ word: likes ] ]
]

> him like her  // incorrect case on the subject pronoun, should be nominative
                // (he) instead of accusative (him)
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: him))
  (1..2: TV (1..2: like))
  (2..3: N (2..3: her)))
[
  child-0: [ child-0: [ word: him ] ]
  child-1: [ child-0: [ word: like ] ]
  child-2: [ child-0: [ word: her ] ]
]

> he like her  // incorrect verb number agreement
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: he))
  (1..2: TV (1..2: like))
  (2..3: N (2..3: her)))
[
  child-2: [ child-0: [ word: her ] ]
  child-1: [ child-0: [ word: like ] ]
  child-0: [ child-0: [ word: he ] ]
]

To fix this, we need to add features to our lexicon, and restrict the sentence rules based on features.

Features are added with square brackets, and are key: value pairs separated by commas. **top** is a special feature value, which basically means "unspecified" -- we'll come back to it later. Features that are unspecified are also assumed to have a **top** value, but sometimes explicitly stating top is more clear.

/// Pronouns
// The added features are:
// * num: sg or pl, whether this noun wants a singular verb (likes) or
//   a plural verb (like). note this is grammatical number, so for example
//   singular they takes plural agreement ("they like X", not *"they likes X")
// * case: nom or acc, whether this noun is nominative or accusative case.
//   nominative case goes in the subject, and accusative in the object.
//   e.g., "he fell" and "she likes him", not *"him fell" and *"her likes he"
// * pron: he, she, they, or ref -- what type of pronoun this is
// * needs_pron: whether this is a reflexive that needs to bind to another
//   pronoun.
N[ num: sg, case: nom, pron: he ]                    -> he
N[ num: sg, case: acc, pron: he ]                    -> him
N[ num: sg, case: acc, pron: ref, needs_pron: he ]   -> himself
N[ num: sg, case: nom, pron: she ]                   -> she
N[ num: sg, case: acc, pron: she ]                   -> her
N[ num: sg, case: acc, pron: ref, needs_pron: she]   -> herself
N[ num: pl, case: nom, pron: they ]                  -> they
N[ num: pl, case: acc, pron: they ]                  -> them
N[ num: pl, case: acc, pron: ref, needs_pron: they ] -> themselves
N[ num: sg, case: acc, pron: ref, needs_pron: they ] -> themself

// Names
// The added features are:
// * num: sg, as people are singular ("mary likes her" / *"mary like her")
// * case: **top**, as names can be both subjects and objects
//   ("mary likes her" / "she likes mary")
// * pron: whichever pronoun the person uses for reflexive agreement
//   mary    pron: she  => mary likes herself
//   sue     pron: they => sue likes themself
//   takeshi pron: he   => takeshi likes himself
N[ num: sg, case: **top**, pron: she ]  -> mary
N[ num: sg, case: **top**, pron: they ] -> sue
N[ num: sg, case: **top**, pron: he ]   -> takeshi
N[ num: sg, case: **top**, pron: he ]   -> robert

// Complementizer doesn't need features
Comp -> that

// Verbs -- intransitive, transitive, and clausal
// The added features are:
// * num: sg, pl, or **top** -- to match the noun numbers.
//   **top** will match either sg or pl, as past-tense verbs in English
//   don't agree in number: "he fell" and "they fell" are both fine
// * tense: past or nonpast -- this won't be used for agreement, but will be
//   copied into the final feature structure, and the client code could do
//   something with it
IV[ num:      sg, tense: nonpast ] -> falls
IV[ num:      pl, tense: nonpast ] -> fall
IV[ num: **top**, tense: past ]    -> fell

TV[ num:      sg, tense: nonpast ] -> likes
TV[ num:      pl, tense: nonpast ] -> like
TV[ num: **top**, tense: past ]    -> liked

CV[ num:      sg, tense: nonpast ] -> says
CV[ num:      pl, tense: nonpast ] -> say
CV[ num: **top**, tense: past ]    -> said

Now that our lexicon is updated with features, we can update our sentence rules to constrain parsing based on those features. This uses two new features, tags and unification. Tags allow features to be associated between nodes in a rule, and unification controls how those features are compatible. The rules for unification are:

  1. A string feature can unify with a string feature with the same value
  2. A top feature can unify with anything, and the nodes are merged
  3. A complex feature ([ ... ] structure) is recursively unified with another complex feature.

If unification fails anywhere, the parse is aborted and the tree is discarded. This allows the programmer to discard trees if features don't match.

// Sentence rules
// Intransitive verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #1)
S -> N[ case: nom, num: #1 ] IV[ num: #1 ]
// Transitive verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #2)
// * If there's a reflexive in the object position, make sure its `needs_pron`
//   feature matches the subject's `pron` feature. If the object isn't a
//   reflexive, then its `needs_pron` feature will implicitly be `**top**`, so
//   will unify with anything.
S -> N[ case: nom, pron: #1, num: #2 ] TV[ num: #2 ] N[ case: acc, needs_pron: #1 ]
// Clausal verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #1)
// * Reflexives can't cross clause boundaries (*"He said that she likes himself"),
//   so we can ignore reflexives and delegate to inner clause rule
S -> N[ case: nom, num: #1 ] CV[ num: #1 ] Comp S

Now that we have this augmented grammar (available as examples/reflexives.fgr), we can try it out and see that it rejects illicit sentences that were previously accepted, while still accepting valid ones:

> he fell
Parsed 1 tree
(0..2: S
  (0..1: N (0..1: he))
  (1..2: IV (1..2: fell)))
[
  child-1: [
    child-0: [ word: fell ]
    num: #0 sg
    tense: past
  ]
  child-0: [
    pron: he
    case: nom
    num: #0
    child-0: [ word: he ]
  ]
]

> he like him
Parsed 0 trees

> he likes himself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: he))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: himself)))
[
  child-1: [
    num: #0 sg
    child-0: [ word: likes ]
    tense: nonpast
  ]
  child-2: [
    needs_pron: #1 he
    num: sg
    child-0: [ word: himself ]
    pron: ref
    case: acc
  ]
  child-0: [
    child-0: [ word: he ]
    pron: #1
    num: #0
    case: nom
  ]
]

> he likes herself
Parsed 0 trees

> mary likes herself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: mary))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: herself)))
[
  child-0: [
    pron: #0 she
    num: #1 sg
    case: nom
    child-0: [ word: mary ]
  ]
  child-1: [
    tense: nonpast
    child-0: [ word: likes ]
    num: #1
  ]
  child-2: [
    child-0: [ word: herself ]
    num: sg
    pron: ref
    case: acc
    needs_pron: #0
  ]
]

> mary likes themself
Parsed 0 trees

> sue likes themself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: sue))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: themself)))
[
  child-0: [
    pron: #0 they
    child-0: [ word: sue ]
    case: nom
    num: #1 sg
  ]
  child-1: [
    tense: nonpast
    num: #1
    child-0: [ word: likes ]
  ]
  child-2: [
    needs_pron: #0
    case: acc
    pron: ref
    child-0: [ word: themself ]
    num: sg
  ]
]

> sue likes himself
Parsed 0 trees

If this is interesting to you and you want to learn more, you can check out my blog series, the excellent textbook Syntactic Theory: A Formal Introduction (2nd ed.), and the DELPH-IN project, whose work on the LKB inspired this simplified version.

Using from code

I need to write this section in more detail, but if you're comfortable with Rust, I suggest looking through the codebase. It's not perfect, it started as one of my first Rust projects (after migrating through F# -> TypeScript -> C in search of the right performance/ergonomics tradeoff), and it could use more tests, but overall it's not too bad.

Basically, the processing pipeline is:

  1. Make a Grammar struct
  • Grammar is defined in rules.rs.
  • The easiest way to make a Grammar is Grammar::parse_from_file, which is mostly a hand-written recusive descent parser in parse_grammar.rs. Yes, I recognize the irony here.
  1. It takes input (in Grammar::parse, which does everything for you, or Grammar::parse_chart, which just does the chart)
  2. The input is first chart-parsed in earley.rs
  3. Then, a forest is built from the chart, in forest.rs, using an algorithm I found in a very useful blog series I forget the URL for, because the algorithms in the academic literature for this are... weird.
  4. Finally, the feature unification is used to prune the forest down to only valid trees. It would be more efficient to do this during parsing, but meh.

The most interesting thing you can do via code and not via the CLI is probably getting at the raw feature DAG, as that would let you do things like pronoun coreference. The DAG code is in featurestructure.rs, and should be fairly approachable -- there's a lot of Rust ceremony around Rc<RefCell<...>> because using an arena allocation crate seemed too harlike overkill, but that is somewhat mitigated by the NodeRef type alias. Hit me up at https://vgel.me/contact if you need help with anything here!

Download Details:
Author: vgel
Source Code: https://github.com/vgel/treebender
License: MIT License

#rust  #machinelearning 

Franz  Becker

Franz Becker

1648803600

Plpgsql Check: Extension That Allows to Check Plpgsql Source Code.

plpgsql_check

I founded this project, because I wanted to publish the code I wrote in the last two years, when I tried to write enhanced checking for PostgreSQL upstream. It was not fully successful - integration into upstream requires some larger plpgsql refactoring - probably it will not be done in next years (now is Dec 2013). But written code is fully functional and can be used in production (and it is used in production). So, I created this extension to be available for all plpgsql developers.

If you like it and if you would to join to development of this extension, register yourself to postgresql extension hacking google group.

Features

  • check fields of referenced database objects and types inside embedded SQL
  • using correct types of function parameters
  • unused variables and function argumens, unmodified OUT argumens
  • partially detection of dead code (due RETURN command)
  • detection of missing RETURN command in function
  • try to identify unwanted hidden casts, that can be performance issue like unused indexes
  • possibility to collect relations and functions used by function
  • possibility to check EXECUTE stmt agaist SQL injection vulnerability

I invite any ideas, patches, bugreports.

plpgsql_check is next generation of plpgsql_lint. It allows to check source code by explicit call plpgsql_check_function.

PostgreSQL PostgreSQL 10, 11, 12, 13 and 14 are supported.

The SQL statements inside PL/pgSQL functions are checked by validator for semantic errors. These errors can be found by plpgsql_check_function:

Active mode

postgres=# CREATE EXTENSION plpgsql_check;
LOAD
postgres=# CREATE TABLE t1(a int, b int);
CREATE TABLE

postgres=#
CREATE OR REPLACE FUNCTION public.f1()
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE r record;
BEGIN
  FOR r IN SELECT * FROM t1
  LOOP
    RAISE NOTICE '%', r.c; -- there is bug - table t1 missing "c" column
  END LOOP;
END;
$function$;

CREATE FUNCTION

postgres=# select f1(); -- execution doesn't find a bug due to empty table t1
  f1 
 ────
   
 (1 row)

postgres=# \x
Expanded display is on.
postgres=# select * from plpgsql_check_function_tb('f1()');
─[ RECORD 1 ]───────────────────────────
functionid │ f1
lineno     │ 6
statement  │ RAISE
sqlstate   │ 42703
message    │ record "r" has no field "c"
detail     │ [null]
hint       │ [null]
level      │ error
position   │ 0
query      │ [null]

postgres=# \sf+ f1
    CREATE OR REPLACE FUNCTION public.f1()
     RETURNS void
     LANGUAGE plpgsql
1       AS $function$
2       DECLARE r record;
3       BEGIN
4         FOR r IN SELECT * FROM t1
5         LOOP
6           RAISE NOTICE '%', r.c; -- there is bug - table t1 missing "c" column
7         END LOOP;
8       END;
9       $function$

Function plpgsql_check_function() has three possible formats: text, json or xml

select * from plpgsql_check_function('f1()', fatal_errors := false);
                         plpgsql_check_function                         
------------------------------------------------------------------------
 error:42703:4:SQL statement:column "c" of relation "t1" does not exist
 Query: update t1 set c = 30
 --                   ^
 error:42P01:7:RAISE:missing FROM-clause entry for table "r"
 Query: SELECT r.c
 --            ^
 error:42601:7:RAISE:too few parameters specified for RAISE
(7 rows)

postgres=# select * from plpgsql_check_function('fx()', format:='xml');
                 plpgsql_check_function                     
────────────────────────────────────────────────────────────────
 <Function oid="16400">                                        ↵
   <Issue>                                                     ↵
     <Level>error</level>                                      ↵
     <Sqlstate>42P01</Sqlstate>                                ↵
     <Message>relation "foo111" does not exist</Message>       ↵
     <Stmt lineno="3">RETURN</Stmt>                            ↵
     <Query position="23">SELECT (select a from foo111)</Query>↵
   </Issue>                                                    ↵
  </Function>
 (1 row)

Arguments

You can set level of warnings via function's parameters:

Mandatory arguments

  • function name or function signature - these functions requires function specification. Any function in PostgreSQL can be specified by Oid or by name or by signature. When you know oid or complete function's signature, you can use a regprocedure type parameter like 'fx()'::regprocedure or 16799::regprocedure. Possible alternative is using a name only, when function's name is unique - like 'fx'. When the name is not unique or the function doesn't exists it raises a error.

Optional arguments

relid DEFAULT 0 - oid of relation assigned with trigger function. It is necessary for check of any trigger function.

fatal_errors boolean DEFAULT true - stop on first error

other_warnings boolean DEFAULT true - show warnings like different attributes number in assignmenet on left and right side, variable overlaps function's parameter, unused variables, unwanted casting, ..

extra_warnings boolean DEFAULT true - show warnings like missing RETURN, shadowed variables, dead code, never read (unused) function's parameter, unmodified variables, modified auto variables, ..

performance_warnings boolean DEFAULT false - performance related warnings like declared type with type modificator, casting, implicit casts in where clause (can be reason why index is not used), ..

security_warnings boolean DEFAULT false - security related checks like SQL injection vulnerability detection

anyelementtype regtype DEFAULT 'int' - a real type used instead anyelement type

anyenumtype regtype DEFAULT '-' - a real type used instead anyenum type

anyrangetype regtype DEFAULT 'int4range' - a real type used instead anyrange type

anycompatibletype DEFAULT 'int' - a real type used instead anycompatible type

anycompatiblerangetype DEFAULT 'int4range' - a real type used instead anycompatible range type

without_warnings DEFAULT false - disable all warnings

all_warnings DEFAULT false - enable all warnings

newtable DEFAULT NULL, oldtable DEFAULT NULL - the names of NEW or OLD transitive tables. These parameters are required when transitive tables are used.

Triggers

When you want to check any trigger, you have to enter a relation that will be used together with trigger function

CREATE TABLE bar(a int, b int);

postgres=# \sf+ foo_trg
    CREATE OR REPLACE FUNCTION public.foo_trg()
         RETURNS trigger
         LANGUAGE plpgsql
1       AS $function$
2       BEGIN
3         NEW.c := NEW.a + NEW.b;
4         RETURN NEW;
5       END;
6       $function$

Missing relation specification

postgres=# select * from plpgsql_check_function('foo_trg()');
ERROR:  missing trigger relation
HINT:  Trigger relation oid must be valid

Correct trigger checking (with specified relation)

postgres=# select * from plpgsql_check_function('foo_trg()', 'bar');
                 plpgsql_check_function                 
--------------------------------------------------------
 error:42703:3:assignment:record "new" has no field "c"
(1 row)

For triggers with transitive tables you can set a oldtable or newtable parameters:

create or replace function footab_trig_func()
returns trigger as $$
declare x int;
begin
  if false then
    -- should be ok;
    select count(*) from newtab into x; 

    -- should fail;
    select count(*) from newtab where d = 10 into x;
  end if;
  return null;
end;
$$ language plpgsql;

select * from plpgsql_check_function('footab_trig_func','footab', newtable := 'newtab');

Mass check

You can use the plpgsql_check_function for mass check functions and mass check triggers. Please, test following queries:

-- check all nontrigger plpgsql functions
SELECT p.oid, p.proname, plpgsql_check_function(p.oid)
   FROM pg_catalog.pg_namespace n
   JOIN pg_catalog.pg_proc p ON pronamespace = n.oid
   JOIN pg_catalog.pg_language l ON p.prolang = l.oid
  WHERE l.lanname = 'plpgsql' AND p.prorettype <> 2279;

or

SELECT p.proname, tgrelid::regclass, cf.*
   FROM pg_proc p
        JOIN pg_trigger t ON t.tgfoid = p.oid 
        JOIN pg_language l ON p.prolang = l.oid
        JOIN pg_namespace n ON p.pronamespace = n.oid,
        LATERAL plpgsql_check_function(p.oid, t.tgrelid) cf
  WHERE n.nspname = 'public' and l.lanname = 'plpgsql'

or

-- check all plpgsql functions (functions or trigger functions with defined triggers)
SELECT
    (pcf).functionid::regprocedure, (pcf).lineno, (pcf).statement,
    (pcf).sqlstate, (pcf).message, (pcf).detail, (pcf).hint, (pcf).level,
    (pcf)."position", (pcf).query, (pcf).context
FROM
(
    SELECT
        plpgsql_check_function_tb(pg_proc.oid, COALESCE(pg_trigger.tgrelid, 0)) AS pcf
    FROM pg_proc
    LEFT JOIN pg_trigger
        ON (pg_trigger.tgfoid = pg_proc.oid)
    WHERE
        prolang = (SELECT lang.oid FROM pg_language lang WHERE lang.lanname = 'plpgsql') AND
        pronamespace <> (SELECT nsp.oid FROM pg_namespace nsp WHERE nsp.nspname = 'pg_catalog') AND
        -- ignore unused triggers
        (pg_proc.prorettype <> (SELECT typ.oid FROM pg_type typ WHERE typ.typname = 'trigger') OR
         pg_trigger.tgfoid IS NOT NULL)
    OFFSET 0
) ss
ORDER BY (pcf).functionid::regprocedure::text, (pcf).lineno

Passive mode

Functions should be checked on start - plpgsql_check module must be loaded.

Configuration

plpgsql_check.mode = [ disabled | by_function | fresh_start | every_start ]
plpgsql_check.fatal_errors = [ yes | no ]

plpgsql_check.show_nonperformance_warnings = false
plpgsql_check.show_performance_warnings = false

Default mode is by_function, that means that the enhanced check is done only in active mode - by plpgsql_check_function. fresh_start means cold start.

You can enable passive mode by

load 'plpgsql'; -- 1.1 and higher doesn't need it
load 'plpgsql_check';
set plpgsql_check.mode = 'every_start';

SELECT fx(10); -- run functions - function is checked before runtime starts it

Limits

plpgsql_check should find almost all errors on really static code. When developer use some PLpgSQL's dynamic features like dynamic SQL or record data type, then false positives are possible. These should be rare - in well written code - and then the affected function should be redesigned or plpgsql_check should be disabled for this function.

CREATE OR REPLACE FUNCTION f1()
RETURNS void AS $$
DECLARE r record;
BEGIN
  FOR r IN EXECUTE 'SELECT * FROM t1'
  LOOP
    RAISE NOTICE '%', r.c;
  END LOOP;
END;
$$ LANGUAGE plpgsql SET plpgsql.enable_check TO false;

A usage of plpgsql_check adds a small overhead (in enabled passive mode) and you should use it only in develop or preprod environments.

Dynamic SQL

This module doesn't check queries that are assembled in runtime. It is not possible to identify results of dynamic queries - so plpgsql_check cannot to set correct type to record variables and cannot to check a dependent SQLs and expressions.

When type of record's variable is not know, you can assign it explicitly with pragma type:

DECLARE r record;
BEGIN
  EXECUTE format('SELECT * FROM %I', _tablename) INTO r;
  PERFORM plpgsql_check_pragma('type: r (id int, processed bool)');
  IF NOT r.processed THEN
    ...

Attention: The SQL injection check can detect only some SQL injection vulnerabilities. This tool cannot be used for security audit! Some issues should not be detected. This check can raise false alarms too - probably when variable is sanitized by other command or when value is of some compose type. 

Refcursors

plpgsql_check should not to detect structure of referenced cursors. A reference on cursor in PLpgSQL is implemented as name of global cursor. In check time, the name is not known (not in all possibilities), and global cursor doesn't exist. It is significant break for any static analyse. PLpgSQL cannot to set correct type for record variables and cannot to check a dependent SQLs and expressions. A solution is same like dynamic SQL. Don't use record variable as target when you use refcursor type or disable plpgsql_check for these functions.

CREATE OR REPLACE FUNCTION foo(refcur_var refcursor)
RETURNS void AS $$
DECLARE
  rec_var record;
BEGIN
  FETCH refcur_var INTO rec_var; -- this is STOP for plpgsql_check
  RAISE NOTICE '%', rec_var;     -- record rec_var is not assigned yet error

In this case a record type should not be used (use known rowtype instead):

CREATE OR REPLACE FUNCTION foo(refcur_var refcursor)
RETURNS void AS $$
DECLARE
  rec_var some_rowtype;
BEGIN
  FETCH refcur_var INTO rec_var;
  RAISE NOTICE '%', rec_var;

Temporary tables

plpgsql_check cannot verify queries over temporary tables that are created in plpgsql's function runtime. For this use case it is necessary to create a fake temp table or disable plpgsql_check for this function.

In reality temp tables are stored in own (per user) schema with higher priority than persistent tables. So you can do (with following trick safetly):

CREATE OR REPLACE FUNCTION public.disable_dml()
RETURNS trigger
LANGUAGE plpgsql AS $function$
BEGIN
  RAISE EXCEPTION SQLSTATE '42P01'
     USING message = format('this instance of %I table doesn''t allow any DML operation', TG_TABLE_NAME),
           hint = format('you should to run "CREATE TEMP TABLE %1$I(LIKE %1$I INCLUDING ALL);" statement',
                         TG_TABLE_NAME);
  RETURN NULL;
END;
$function$;

CREATE TABLE foo(a int, b int); -- doesn't hold data ever
CREATE TRIGGER foo_disable_dml
   BEFORE INSERT OR UPDATE OR DELETE ON foo
   EXECUTE PROCEDURE disable_dml();

postgres=# INSERT INTO  foo VALUES(10,20);
ERROR:  this instance of foo table doesn't allow any DML operation
HINT:  you should to run "CREATE TEMP TABLE foo(LIKE foo INCLUDING ALL);" statement
postgres=# 

CREATE TABLE
postgres=# INSERT INTO  foo VALUES(10,20);
INSERT 0 1

This trick emulates GLOBAL TEMP tables partially and it allows a statical validation. Other possibility is using a [template foreign data wrapper] (https://github.com/okbob/template_fdw)

You can use pragma table and create ephemeral table:

BEGIN
   CREATE TEMP TABLE xxx(a int);
   PERFORM plpgsql_check_pragma('table: xxx(a int)');
   INSERT INTO xxx VALUES(10);

Dependency list

A function plpgsql_show_dependency_tb can show all functions, operators and relations used inside processed function:

postgres=# select * from plpgsql_show_dependency_tb('testfunc(int,float)');
┌──────────┬───────┬────────┬─────────┬────────────────────────────┐
│   type   │  oid  │ schema │  name   │           params           │
╞══════════╪═══════╪════════╪═════════╪════════════════════════════╡
│ FUNCTION │ 36008 │ public │ myfunc1 │ (integer,double precision) │
│ FUNCTION │ 35999 │ public │ myfunc2 │ (integer,double precision) │
│ OPERATOR │ 36007 │ public │ **      │ (integer,integer)          │
│ RELATION │ 36005 │ public │ myview  │                            │
│ RELATION │ 36002 │ public │ mytable │                            │
└──────────┴───────┴────────┴─────────┴────────────────────────────┘
(4 rows)

Profiler

The plpgsql_check contains simple profiler of plpgsql functions and procedures. It can work with/without a access to shared memory. It depends on shared_preload_libraries config. When plpgsql_check was initialized by shared_preload_libraries, then it can allocate shared memory, and function's profiles are stored there. When plpgsql_check cannot to allocate shared momory, the profile is stored in session memory.

Due dependencies, shared_preload_libraries should to contains plpgsql first

postgres=# show shared_preload_libraries ;
┌──────────────────────────┐
│ shared_preload_libraries │
╞══════════════════════════╡
│ plpgsql,plpgsql_check    │
└──────────────────────────┘
(1 row)

The profiler is active when GUC plpgsql_check.profiler is on. The profiler doesn't require shared memory, but if there are not shared memory, then the profile is limmitted just to active session.

When plpgsql_check is initialized by shared_preload_libraries, another GUC is available to configure the amount of shared memory used by the profiler: plpgsql_check.profiler_max_shared_chunks. This defines the maximum number of statements chunk that can be stored in shared memory. For each plpgsql function (or procedure), the whole content is split into chunks of 30 statements. If needed, multiple chunks can be used to store the whole content of a single function. A single chunk is 1704 bytes. The default value for this GUC is 15000, which should be enough for big projects containing hundred of thousands of statements in plpgsql, and will consume about 24MB of memory. If your project doesn't require that much number of chunks, you can set this parameter to a smaller number in order to decrease the memory usage. The minimum value is 50 (which should consume about 83kB of memory), and the maximum value is 100000 (which should consume about 163MB of memory). Changing this parameter requires a PostgreSQL restart.

The profiler will also retrieve the query identifier for each instruction that contains an expression or optimizable statement. Note that this requires pg_stat_statements, or another similar third-party extension), to be installed. There are some limitations to the query identifier retrieval:

  • if a plpgsql expression contains underlying statements, only the top level query identifier will be retrieved
  • the profiler doesn't compute query identifier by itself but relies on external extension, such as pg_stat_statements, for that. It means that depending on the external extension behavior, you may not be able to see a query identifier for some statements. That's for instance the case with DDL statements, as pg_stat_statements doesn't expose the query identifier for such queries.
  • a query identifier is retrieved only for instructions containing expressions. This means that plpgsql_profiler_function_tb() function can report less query identifier than instructions on a single line.

Attention: A update of shared profiles can decrease performance on servers under higher load.

The profile can be displayed by function plpgsql_profiler_function_tb:

postgres=# select lineno, avg_time, source from plpgsql_profiler_function_tb('fx(int)');
┌────────┬──────────┬───────────────────────────────────────────────────────────────────┐
│ lineno │ avg_time │                              source                               │
╞════════╪══════════╪═══════════════════════════════════════════════════════════════════╡
│      1 │          │                                                                   │
│      2 │          │ declare result int = 0;                                           │
│      3 │    0.075 │ begin                                                             │
│      4 │    0.202 │   for i in 1..$1 loop                                             │
│      5 │    0.005 │     select result + i into result; select result + i into result; │
│      6 │          │   end loop;                                                       │
│      7 │        0 │   return result;                                                  │
│      8 │          │ end;                                                              │
└────────┴──────────┴───────────────────────────────────────────────────────────────────┘
(9 rows)

The profile per statements (not per line) can be displayed by function plpgsql_profiler_function_statements_tb:

        CREATE OR REPLACE FUNCTION public.fx1(a integer)
         RETURNS integer
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         if a > 10 then
4           raise notice 'ahoj';
5           return -1;
6         else
7           raise notice 'nazdar';
8           return 1;
9         end if;
10      end;
11      $function$

postgres=# select stmtid, parent_stmtid, parent_note, lineno, exec_stmts, stmtname
             from plpgsql_profiler_function_statements_tb('fx1');
┌────────┬───────────────┬─────────────┬────────┬────────────┬─────────────────┐
│ stmtid │ parent_stmtid │ parent_note │ lineno │ exec_stmts │    stmtname     │
╞════════╪═══════════════╪═════════════╪════════╪════════════╪═════════════════╡
│      0 │             ∅ │ ∅           │      2 │          0 │ statement block │
│      1 │             0 │ body        │      3 │          0 │ IF              │
│      2 │             1 │ then body   │      4 │          0 │ RAISE           │
│      3 │             1 │ then body   │      5 │          0 │ RETURN          │
│      4 │             1 │ else body   │      7 │          0 │ RAISE           │
│      5 │             1 │ else body   │      8 │          0 │ RETURN          │
└────────┴───────────────┴─────────────┴────────┴────────────┴─────────────────┘
(6 rows)

All stored profiles can be displayed by calling function plpgsql_profiler_functions_all:

postgres=# select * from plpgsql_profiler_functions_all();
┌───────────────────────┬────────────┬────────────┬──────────┬─────────────┬──────────┬──────────┐
│        funcoid        │ exec_count │ total_time │ avg_time │ stddev_time │ min_time │ max_time │
╞═══════════════════════╪════════════╪════════════╪══════════╪═════════════╪══════════╪══════════╡
│ fxx(double precision) │          1 │       0.01 │     0.01 │        0.00 │     0.01 │     0.01 │
└───────────────────────┴────────────┴────────────┴──────────┴─────────────┴──────────┴──────────┘
(1 row)

There are two functions for cleaning stored profiles: plpgsql_profiler_reset_all() and plpgsql_profiler_reset(regprocedure).

Coverage metrics

plpgsql_check provides two functions:

  • plpgsql_coverage_statements(name)
  • plpgsql_coverage_branches(name)

Note

There is another very good PLpgSQL profiler - https://bitbucket.org/openscg/plprofiler

My extension is designed to be simple for use and practical. Nothing more or less.

plprofiler is more complex. It build call graphs and from this graph it can creates flame graph of execution times.

Both extensions can be used together with buildin PostgreSQL's feature - tracking functions.

set track_functions to 'pl';
...
select * from pg_stat_user_functions;

Tracer

plpgsql_check provides a tracing possibility - in this mode you can see notices on start or end functions (terse and default verbosity) and start or end statements (verbose verbosity). For default and verbose verbosity the content of function arguments is displayed. The content of related variables are displayed when verbosity is verbose.

postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #0 ->> start of inline_code_block (Oid=0)
NOTICE:  #2   ->> start of function fx(integer,integer,date,text) (Oid=16405)
NOTICE:  #2        call by inline_code_block line 1 at PERFORM
NOTICE:  #2       "a" => '10', "b" => null, "c" => '2020-08-03', "d" => 'stěhule'
NOTICE:  #4     ->> start of function fx(integer) (Oid=16404)
NOTICE:  #4          call by fx(integer,integer,date,text) line 1 at PERFORM
NOTICE:  #4         "a" => '10'
NOTICE:  #4     <<- end of function fx (elapsed time=0.098 ms)
NOTICE:  #2   <<- end of function fx (elapsed time=0.399 ms)
NOTICE:  #0 <<- end of block (elapsed time=0.754 ms)

The number after # is a execution frame counter (this number is related to deep of error context stack). It allows to pair start end and of function.

Tracing is enabled by setting plpgsql_check.tracer to on. Attention - enabling this behaviour has significant negative impact on performance (unlike the profiler). You can set a level for output used by tracer plpgsql_check.tracer_errlevel (default is notice). The output content is limited by length specified by plpgsql_check.tracer_variable_max_length configuration variable.

In terse verbose mode the output is reduced:

postgres=# set plpgsql_check.tracer_verbosity TO terse;
SET
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #0 start of inline code block (oid=0)
NOTICE:  #2 start of fx (oid=16405)
NOTICE:  #4 start of fx (oid=16404)
NOTICE:  #4 end of fx
NOTICE:  #2 end of fx
NOTICE:  #0 end of inline code block

In verbose mode the output is extended about statement details:

postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #0            ->> start of block inline_code_block (oid=0)
NOTICE:  #0.1       1  --> start of PERFORM
NOTICE:  #2              ->> start of function fx(integer,integer,date,text) (oid=16405)
NOTICE:  #2                   call by inline_code_block line 1 at PERFORM
NOTICE:  #2                  "a" => '10', "b" => null, "c" => '2020-08-04', "d" => 'stěhule'
NOTICE:  #2.1       1    --> start of PERFORM
NOTICE:  #2.1                "a" => '10'
NOTICE:  #4                ->> start of function fx(integer) (oid=16404)
NOTICE:  #4                     call by fx(integer,integer,date,text) line 1 at PERFORM
NOTICE:  #4                    "a" => '10'
NOTICE:  #4.1       6      --> start of assignment
NOTICE:  #4.1                  "a" => '10', "b" => '20'
NOTICE:  #4.1              <-- end of assignment (elapsed time=0.076 ms)
NOTICE:  #4.1                  "res" => '130'
NOTICE:  #4.2       7      --> start of RETURN
NOTICE:  #4.2                  "res" => '130'
NOTICE:  #4.2              <-- end of RETURN (elapsed time=0.054 ms)
NOTICE:  #4                <<- end of function fx (elapsed time=0.373 ms)
NOTICE:  #2.1            <-- end of PERFORM (elapsed time=0.589 ms)
NOTICE:  #2              <<- end of function fx (elapsed time=0.727 ms)
NOTICE:  #0.1          <-- end of PERFORM (elapsed time=1.147 ms)
NOTICE:  #0            <<- end of block (elapsed time=1.286 ms)

Special feature of tracer is tracing of ASSERT statement when plpgsql_check.trace_assert is on. When plpgsql_check.trace_assert_verbosity is DEFAULT, then all function's or procedure's variables are displayed when assert expression is false. When this configuration is VERBOSE then all variables from all plpgsql frames are displayed. This behaviour is independent on plpgsql.check_asserts value. It can be used, although the assertions are disabled in plpgsql runtime.

postgres=# set plpgsql_check.tracer to off;
postgres=# set plpgsql_check.trace_assert_verbosity TO verbose;

postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #4 PLpgSQL assert expression (false) on line 12 of fx(integer) is false
NOTICE:   "a" => '10', "res" => null, "b" => '20'
NOTICE:  #2 PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
NOTICE:   "a" => '10', "b" => null, "c" => '2020-08-05', "d" => 'stěhule'
NOTICE:  #0 PL/pgSQL function inline_code_block line 1 at PERFORM
ERROR:  assertion failed
CONTEXT:  PL/pgSQL function fx(integer) line 12 at ASSERT
SQL statement "SELECT fx(a)"
PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
SQL statement "SELECT fx(10,null, 'now', e'stěhule')"
PL/pgSQL function inline_code_block line 1 at PERFORM

postgres=# set plpgsql.check_asserts to off;
SET
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE:  #4 PLpgSQL assert expression (false) on line 12 of fx(integer) is false
NOTICE:   "a" => '10', "res" => null, "b" => '20'
NOTICE:  #2 PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
NOTICE:   "a" => '10', "b" => null, "c" => '2020-08-05', "d" => 'stěhule'
NOTICE:  #0 PL/pgSQL function inline_code_block line 1 at PERFORM
DO

Attention - SECURITY

Tracer prints content of variables or function arguments. For security definer function, this content can hold security sensitive data. This is reason why tracer is disabled by default and should be enabled only with super user rights plpgsql_check.enable_tracer.

Pragma

You can configure plpgsql_check behave inside checked function with "pragma" function. This is a analogy of PL/SQL or ADA language of PRAGMA feature. PLpgSQL doesn't support PRAGMA, but plpgsql_check detects function named plpgsql_check_pragma and get options from parameters of this function. These plpgsql_check options are valid to end of group of statements.

CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
BEGIN
  ...
  -- for following statements disable check
  PERFORM plpgsql_check_pragma('disable:check');
  ...
  -- enable check again
  PERFORM plpgsql_check_pragma('enable:check');
  ...
END;
$$ LANGUAGE plpgsql;

The function plpgsql_check_pragma is immutable function that returns one. It is defined by plpgsql_check extension. You can declare alternative plpgsql_check_pragma function like:

CREATE OR REPLACE FUNCTION plpgsql_check_pragma(VARIADIC args[])
RETURNS int AS $$
SELECT 1
$$ LANGUAGE sql IMMUTABLE;

Using pragma function in declaration part of top block sets options on function level too.

CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
DECLARE
  aux int := plpgsql_check_pragma('disable:extra_warnings');
  ...

Shorter syntax for pragma is supported too:

CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
DECLARE r record;
BEGIN
  PERFORM 'PRAGMA:TYPE:r (a int, b int)';
  PERFORM 'PRAGMA:TABLE: x (like pg_class)';
  ...

Supported pragmas

echo:str - print string (for testing)

status:check,status:tracer, status:other_warnings, status:performance_warnings, status:extra_warnings,status:security_warnings

enable:check,enable:tracer, enable:other_warnings, enable:performance_warnings, enable:extra_warnings,enable:security_warnings

disable:check,disable:tracer, disable:other_warnings, disable:performance_warnings, disable:extra_warnings,disable:security_warnings

type:varname typename or type:varname (fieldname type, ...) - set type to variable of record type

table: name (column_name type, ...) or table: name (like tablename) - create ephereal table

Pragmas enable:tracer and disable:tracerare active for Postgres 12 and higher

Compilation

You need a development environment for PostgreSQL extensions:

make clean
make install

result:

[pavel@localhost plpgsql_check]$ make USE_PGXS=1 clean
rm -f plpgsql_check.so   libplpgsql_check.a  libplpgsql_check.pc
rm -f plpgsql_check.o
rm -rf results/ regression.diffs regression.out tmp_check/ log/
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 all
clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -I. -I./ -I/usr/local/pgsql/include/server -I/usr/local/pgsql/include/internal -D_GNU_SOURCE   -c -o plpgsql_check.o plpgsql_check.c
clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -shared -o plpgsql_check.so plpgsql_check.o -L/usr/local/pgsql/lib -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags  
[pavel@localhost plpgsql_check]$ su root
Password: *******
[root@localhost plpgsql_check]# make USE_PGXS=1 install
/usr/bin/mkdir -p '/usr/local/pgsql/lib'
/usr/bin/mkdir -p '/usr/local/pgsql/share/extension'
/usr/bin/mkdir -p '/usr/local/pgsql/share/extension'
/usr/bin/install -c -m 755  plpgsql_check.so '/usr/local/pgsql/lib/plpgsql_check.so'
/usr/bin/install -c -m 644 plpgsql_check.control '/usr/local/pgsql/share/extension/'
/usr/bin/install -c -m 644 plpgsql_check--0.9.sql '/usr/local/pgsql/share/extension/'
[root@localhost plpgsql_check]# exit
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 installcheck
/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --psqldir='/usr/local/pgsql/bin'    --dbname=pl_regression --load-language=plpgsql --dbname=contrib_regression plpgsql_check_passive plpgsql_check_active plpgsql_check_active-9.5
(using postmaster on Unix socket, default port)
============== dropping database "contrib_regression" ==============
DROP DATABASE
============== creating database "contrib_regression" ==============
CREATE DATABASE
ALTER DATABASE
============== installing plpgsql                     ==============
CREATE LANGUAGE
============== running regression test queries        ==============
test plpgsql_check_passive    ... ok
test plpgsql_check_active     ... ok
test plpgsql_check_active-9.5 ... ok

=====================
 All 3 tests passed. 
=====================

Compilation on Ubuntu

Sometimes successful compilation can require libicu-dev package (PostgreSQL 10 and higher - when pg was compiled with ICU support)

sudo apt install libicu-dev

Compilation plpgsql_check on Windows

You can check precompiled dll libraries http://okbob.blogspot.cz/2015/02/plpgsqlcheck-is-available-for-microsoft.html

or compile by self:

  1. Download and install PostgreSQL for Win32 from http://www.enterprisedb.com
  2. Download and install Microsoft Visual C++ Express
  3. Lern tutorial http://blog.2ndquadrant.com/compiling-postgresql-extensions-visual-studio-windows
  4. Build plpgsql_check.dll
  5. Install plugin
  6. copy plpgsql_check.dll to PostgreSQL\14\lib
  7. copy plpgsql_check.control and plpgsql_check--2.1.sql to PostgreSQL\14\share\extension

Checked on

  • gcc on Linux (against all supported PostgreSQL)
  • clang 3.4 on Linux (against PostgreSQL 10)
  • for success regress tests the PostgreSQL 10 or higher is required

Compilation against PostgreSQL 10 requires libICU!

Licence

Copyright (c) Pavel Stehule (pavel.stehule@gmail.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Note

If you like it, send a postcard to address

Pavel Stehule
Skalice 12
256 01 Benesov u Prahy
Czech Republic

I invite any questions, comments, bug reports, patches on mail address pavel.stehule@gmail.com


Author: okbob
Source Code: https://github.com/okbob/plpgsql_check
License: View license

#postgresql