Tools for managing “global state” are plentiful these days, but most of these tools:
React Query exports a set of hooks that address these issues. Out of the box, React Query:
Zeit’s SWR is a great library, and is very similar in spirit and implementation to React Query with a few notable differences:
todos in its key, regardless of variables, or you can target specific queries with (or without) variables, and even use functional filtering to select queries in most places. This architecture is much more robust and forgiving, especially for larger apps.
<link rel='preload'> and/or manually fetching and updating the query cache
usePaginatedQuery
useInfiniteQuery
useQuery
usePaginatedQuery
useInfiniteQuery
useMutation
queryCache
queryCache.prefetchQuery
queryCache.getQueryData
queryCache.setQueryData
queryCache.refetchQueries
queryCache.removeQueries
queryCache.getQuery
queryCache.isFetching
queryCache.subscribe
queryCache.clear
useIsFetching
ReactQueryConfigProvider
setConsole
$ npm i --save react-query
# or
$ yarn add react-query
To make a new query, call the useQuery hook with at least:
const info = useQuery('todos', fetchTodoList)
The unique key you provide is used internally for refetching, caching, and deduping related queries.
This key can be whatever you’d like it to be as long as:
The query info returned contains all information about the query and can be easily destructured and used in your component:
function Todos() {
const { status, data, error } = useQuery('todos', fetchTodoList)
return (
<div>
{status === 'loading' ? (
<span>Loading...</span>
) : status === 'error' ? (
<span>Error: {error.message}</span>
) : (
// also status === 'success', but "else" logic works, too
<ul>
{data.map(todo => (
<li key={todo.id}>{todo.title}</li>
))}
</ul>
)}
</div>
)
}
At its core, React Query manages query caching for you and uses a serializable array or “query key” to do this. Using a query key that is simple and unique to the query’s data is very important. In other similar libraries you’ll see the use of URLs and/or GraphQL query template strings to achieve this, but we believe that, at scale, this becomes prone to typos and errors. To relieve this issue, query keys in React Query can be strings or an array with a string and then any number of serializable primitives and/or objects.
The simplest form of a key is actually not an array, but an individual string. When a string query key is passed, it is converted to an array internally with the string as the only item in the query key. This format is useful for:
// A list of todos
useQuery('todos', ...) // queryKey === ['todos']
// Something else, whatever!
useQuery('somethingSpecial', ...) // queryKey === ['somethingSpecial']
When a query needs more information to uniquely describe its data, you can use an array with a string and any number of serializable objects to describe it. This is useful for:
// A list of todos that are "done"
useQuery(['todos', { status: 'done' }], ...) // queryKey === ['todos', { status: 'done' }]
// An individual todo
useQuery(['todos', 5], ...) // queryKey === ['todos', 5]
// An individual todo in a "preview" format
useQuery(['todos', 5, { preview: true }], ...) // queryKey === ['todos', 5, { preview: true }]
The order of keys inside objects in a query key does not matter: all of the following queries would result in the same final query key of ['todos', { page, status }]:
useQuery(['todos', { status, page }], ...)
useQuery(['todos', { page, status }], ...)
useQuery(['todos', { page, status, other: undefined }], ...)
The following query keys, however, are not equal. Array item order matters!
useQuery(['todos', status, page], ...)
useQuery(['todos', page, status], ...)
useQuery(['todos', undefined, page, status], ...)
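The ordering rules above can be illustrated with a small sketch. This is not React Query's internal hashing code; stableStringify is a hypothetical helper that sorts object keys before serializing, which is one simple way to get the behavior described (object key order ignored, array item order preserved).

```javascript
// Hypothetical helper (not part of React Query): serialize a query key so that
// object key order does not matter, while array item order still does.
function stableStringify(queryKey) {
  return JSON.stringify(queryKey, (_key, value) => {
    // Rebuild plain objects with their keys sorted alphabetically
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      const sorted = {}
      for (const k of Object.keys(value).sort()) {
        sorted[k] = value[k]
      }
      return sorted
    }
    return value
  })
}

const page = 1
const status = 'done'

// Same result: object key order is ignored and `undefined` properties are dropped
const a = stableStringify(['todos', { status, page }])
const b = stableStringify(['todos', { page, status }])
const c = stableStringify(['todos', { page, status, other: undefined }])

// Different results: array item order is significant
const d = stableStringify(['todos', status, page])
const e = stableStringify(['todos', page, status])
```

Comparing the strings shows that a, b and c are identical while d and e differ.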
To use external props, state, or variables in a query function, it’s easiest to pass them as items in your array query keys! All query keys get passed through to your query function as parameters in the order they appear in the array key:
function Todos({ completed, page }) {
  const { status, data, error } = useQuery(
    ['todos', { completed, page }],
    fetchTodoList
  )
}
// Access the key, completed and page variables in your query function!
function fetchTodoList(key, { completed, page }) {
  return new Promise((resolve, reject) => {
    // ...
  })
}
If you send through more items in your query key, they will also be available in your query function:
function Todo({ todoId, preview }) {
const { status, data, error } = useQuery(
['todo', todoId, { preview }],
fetchTodoById
)
}
// Access todoId and preview in your query function!
function fetchTodoById(key, todoId, { preview }) {
  return new Promise((resolve, reject) => {
    // ...
  })
}
Whenever a query’s key changes, the query will automatically update. In the following example, a new query is created whenever todoId changes:
function Todo({ todoId }) {
const { status, data, error } = useQuery(['todo', todoId], fetchTodo)
}
In some scenarios, you may find yourself needing to pass extra information to your query that shouldn’t be (or doesn’t need to be) a part of the query key. useQuery, usePaginatedQuery and useInfiniteQuery all support passing an optional array of additional parameters to be passed to your query function:
function Todo({ todoId, debug }) {
const { status, data, error } = useQuery(
// These will be used as the query key
['todo', todoId],
// These will get passed directly to our query function
[
debug,
{
foo: true,
bar: false,
},
],
fetchTodoById
)
}
function fetchTodoById(key, todoId, debug, { foo, bar }) {
  return new Promise((resolve, reject) => {
    // ...
  })
}
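To make the argument order concrete, here is a minimal sketch of how the key items and the additional parameters could be spread into a query function. callQueryFn is a hypothetical stand-in written for this illustration, not React Query's internal code, and the query function here simply records what it receives instead of fetching anything.

```javascript
// Hypothetical stand-in for how arguments are ordered; not React Query internals.
function callQueryFn(queryFn, queryKey, variables = []) {
  // A string key is treated as a one-item array key
  const keyArray = Array.isArray(queryKey) ? queryKey : [queryKey]
  // Key items come first, then the additional parameters, in order
  return queryFn(...keyArray, ...variables)
}

// Records its arguments so the order is visible
function fetchTodoById(key, todoId, debug, { foo, bar }) {
  return { key, todoId, debug, foo, bar }
}

const received = callQueryFn(
  fetchTodoById,
  ['todo', 5],
  [true, { foo: true, bar: false }]
)
```

Here received.key is 'todo', received.todoId is 5, and the remaining fields come from the additional parameters array.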
React Query makes it easy to make queries that depend on other queries for both:
To do this, you can use the following 2 approaches:
If a query isn’t ready to be requested yet, just pass a falsey value as the query key or as an item in the query key:
// Get the user
const { data: user } = useQuery(['user', { email }], getUserByEmail)
// Then get the user's projects
const { data: projects } = useQuery(
// `user` would be `null` at first (falsey),
// so the query will not execute until the user exists
user && ['projects', { userId: user.id }],
getProjectsByUser
)
Similar to above, you can also pass falsey items in your query key array:
// Only get the user when `email` is available
const { data: user } = useQuery(['user', email], getUserByEmail)
// Then get the user's projects
const { data: projects } = useQuery(
// `user && user.id` would be (falsey) at first,
// so the query will not execute until the user exists
['projects', user && user.id], // You could also do `user?.id` if you're using the latest babel!
getProjectsByUser
)
If a function is passed, the query will not execute until the function can be called without throwing:
// Get the user
const { data: user } = useQuery(['user', { email }], getUserByEmail)
// Then get the user's projects
const { data: projects } = useQuery(
  // This will throw trying to access property `id` of `undefined` until the `user` is available
  () => ['projects', { userId: user.id }],
  getProjectsByUser
)
The two approaches can also be combined:
const [ready, setReady] = React.useState(false)
// Get the user when we are `ready`
const { data: user } = useQuery(ready && ['user', { email }], getUserByEmail) // Wait for ready to be truthy
// Then get the user's projects
const { data: projects } = useQuery(
  () => ['projects', { userId: user.id }], // Wait for user.id to become available (and not throw)
  getProjectsByUser
)
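The readiness rules for dependent queries can be sketched as a plain function. resolveQueryKey below is a hypothetical helper, not the library's implementation. As a simplifying assumption it treats only null, undefined and false items as "not ready"; how the library treats other falsy values such as 0 is not covered here.

```javascript
// Hypothetical helper: returns the usable key, or null if the query should wait.
function resolveQueryKey(queryKey) {
  let key = queryKey
  if (typeof key === 'function') {
    try {
      key = key()
    } catch (err) {
      return null // the key function threw, so the query is not ready yet
    }
  }
  if (!key) return null // a falsy key means "do not run"
  // Assumption for this sketch: null, undefined and false items block the query
  if (Array.isArray(key) && key.some(item => item == null || item === false)) {
    return null
  }
  return key
}

let user // not loaded yet

const viaFalsyKey = resolveQueryKey(user && ['projects', { userId: user.id }])
const viaFalsyItem = resolveQueryKey(['projects', user && user.id])
const viaThrowingFn = resolveQueryKey(() => ['projects', { userId: user.id }])

user = { id: 7 } // the user query resolved

const ready = resolveQueryKey(() => ['projects', { userId: user.id }])
```

The first three results are null because the user is not available; once it is, the function form yields ['projects', { userId: 7 }].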
React Query caching is automatic out of the box. It uses a stale-while-revalidate in-memory caching strategy together with robust query deduping to always ensure a query’s data is only cached when it’s needed and only cached once even if that query is used multiple times across your application.
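At a glance:
- When query results become stale can be configured using the staleTime option at both the global and query-level.
- How long an unused query stays cached can be configured using the cacheTime option at both the global and query-level.
- After a query has been inactive for the cacheTime specified (defaults to 5 minutes), the query is deleted and garbage collected.
A more detailed example of the caching lifecycle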
Let’s assume we are using the default cacheTime of 5 minutes and the default staleTime of 0.
- An instance of useQuery('todos', fetchTodos) mounts.
- The data is cached using 'todos' as the unique identifier for that cache.
- A stale invalidation is scheduled using the staleTime option as a delay (defaults to 0, or immediately).
- A second instance of useQuery('todos', fetchTodos) mounts elsewhere.
- Both instances of the useQuery('todos', fetchTodos) query are unmounted and no longer in use.
- A cache timeout is set using cacheTime to delete and garbage collect the query (defaults to 5 minutes).
- No more instances of useQuery('todos', fetchTodos) appear within 5 minutes.
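The lifecycle above can be modeled with a few lines of bookkeeping. The CachedQuery class below is a hypothetical model written for this walkthrough, not React Query's cache implementation. Timestamps are passed in explicitly (in milliseconds) instead of using real timers, so the steps are easy to follow.

```javascript
// Hypothetical model of a single cached query; not the library's implementation.
class CachedQuery {
  constructor({ staleTime = 0, cacheTime = 5 * 60 * 1000 } = {}) {
    this.staleTime = staleTime
    this.cacheTime = cacheTime
    this.instances = 0 // mounted usages of this query
    this.updatedAt = null // when data was last stored
    this.inactiveSince = null // when the last instance unmounted
  }
  mount() {
    this.instances += 1
    this.inactiveSince = null // an active query is never garbage collected
  }
  unmount(now) {
    this.instances -= 1
    if (this.instances === 0) this.inactiveSince = now
  }
  setData(data, now) {
    this.data = data
    this.updatedAt = now
  }
  isStale(now) {
    return this.updatedAt === null || now - this.updatedAt >= this.staleTime
  }
  shouldGarbageCollect(now) {
    return this.inactiveSince !== null && now - this.inactiveSince >= this.cacheTime
  }
}

const query = new CachedQuery() // default staleTime of 0, cacheTime of 5 minutes
query.mount() // first instance mounts
query.setData(['todo'], 0) // data is fetched and cached at t = 0
const staleRightAway = query.isStale(0)
query.mount() // second instance mounts elsewhere
query.unmount(1000)
query.unmount(2000) // both instances are gone at t = 2s
const collectedEarly = query.shouldGarbageCollect(2000 + 4 * 60 * 1000)
const collectedLater = query.shouldGarbageCollect(2000 + 5 * 60 * 1000)
```

With the defaults, the data is stale immediately, the query survives 4 minutes of inactivity, and it becomes eligible for garbage collection once 5 minutes have passed.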
usePaginatedQuery
Rendering paginated data is a very common UI pattern to avoid overloading bandwidth or even your UI. React Query exposes a usePaginatedQuery hook, very similar to useQuery, that helps with this very scenario.
Consider the following example where we would ideally want to increment a pageIndex (or cursor) for a query. If we were to use useQuery, it would technically work fine, but the UI would jump in and out of the success and loading states as different queries are created and destroyed for each page or cursor. By using usePaginatedQuery we get a few new things:
- Instead of data, you should use resolvedData. This is the data from the last known successful query result. As new page queries resolve, resolvedData remains available to show the last page’s data while a new page is requested. When the new page data is received, resolvedData gets updated to the new page’s data.
- If you need the data for the exact page being requested, latestData is available. When the desired page is being requested, latestData will be undefined until the query resolves, then it will get updated with the latest page’s data result.
function Todos() {
const [page, setPage] = React.useState(0)
const fetchProjects = (key, page = 0) =>
fetch('/api/projects?page=' + page).then(res => res.json())
const {
status,
resolvedData,
latestData,
error,
isFetching,
} = usePaginatedQuery(['projects', page], fetchProjects)
return (
<div>
{status === 'loading' ? (
<div>Loading...</div>
) : status === 'error' ? (
<div>Error: {error.message}</div>
) : (
// `resolvedData` will either resolve to the latest page's data
// or if fetching a new page, the last successful page's data
<div>
{resolvedData.projects.map(project => (
<p key={project.id}>{project.name}</p>
))}
</div>
)}
<span>Current Page: {page + 1}</span>
<button
onClick={() => setPage(old => Math.max(old - 1, 0))}
disabled={page === 0}
>
Previous Page
</button>{' '}
<button
onClick={() =>
// Here, we use `latestData` so the Next Page
// button isn't relying on potentially old data
setPage(old => (!latestData || !latestData.hasMore ? old : old + 1))
}
disabled={!latestData || !latestData.hasMore}
>
Next Page
</button>
{// Since the last page's data potentially sticks around between page requests,
// we can use `isFetching` to show a background loading
// indicator since our `status === 'loading'` state won't be triggered
isFetching ? <span> Loading...</span> : null}{' '}
</div>
)
}
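The difference between resolvedData and latestData can be shown with a tiny state model. The functions below are hypothetical and only mimic the behavior described for usePaginatedQuery; they are not how the hook is implemented.

```javascript
// Hypothetical state model for the two data fields; not the hook's internals.
function createPaginatedState() {
  return { resolvedData: undefined, latestData: undefined }
}

// A new page was requested: keep the last successful data, clear latestData
function requestPage(state) {
  return { resolvedData: state.resolvedData, latestData: undefined }
}

// The requested page arrived: both fields now point at the new page
function resolvePage(state, pageData) {
  return { resolvedData: pageData, latestData: pageData }
}

const page0 = { projects: [{ id: 1, name: 'First' }], hasMore: true }
const page1 = { projects: [{ id: 2, name: 'Second' }], hasMore: false }

const afterFirstLoad = resolvePage(requestPage(createPaginatedState()), page0)
const whileLoadingNext = requestPage(afterFirstLoad)
const afterSecondLoad = resolvePage(whileLoadingNext, page1)
```

While the second page is loading, whileLoadingNext.resolvedData still holds the first page and whileLoadingNext.latestData is undefined, which is why the Next Page button in the example is disabled during that time.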
useInfiniteQuery
Rendering lists that can additively “load more” data onto an existing set of data or “infinite scroll” is also a very common UI pattern. React Query supports a useful version of useQuery called useInfiniteQuery for querying these types of lists.
When using useInfiniteQuery, you’ll notice a few things are different:
- data is now an array of arrays that contain query group results, instead of the query results themselves
- A fetchMore function is now available
- A getFetchMore option is available for both determining if there is more data to load and the information to fetch it. This information is supplied as an additional parameter in the query function (which can optionally be overridden when calling the fetchMore function)
- A canFetchMore boolean is now available and is true if getFetchMore returns a truthy value
- An isFetchingMore boolean is now available to distinguish between a background refresh state and a loading more state
Let’s assume we have an API that returns pages of projects 3 at a time based on a cursor index along with a cursor that can be used to fetch the next group of projects
fetch('/api/projects?cursor=0')
// { data: [...], nextCursor: 3}
fetch('/api/projects?cursor=3')
// { data: [...], nextCursor: 6}
fetch('/api/projects?cursor=6')
// { data: [...], nextCursor: 9}
fetch('/api/projects?cursor=9')
// { data: [...] }
With this information, we can create a “Load More” UI by:
- Waiting for useInfiniteQuery to request the first group of data by default
- Returning the information for the next query in getFetchMore
- Calling the fetchMore function
Note: It’s very important you do not call fetchMore with arguments unless you want them to override the fetchMoreInfo data returned from the getFetchMore function. e.g. Do not do this: <button onClick={fetchMore} /> as this would send the onClick event to the fetchMore function.
import { useInfiniteQuery } from 'react-query'
function Projects() {
const fetchProjects = (key, cursor = 0) =>
fetch('/api/projects?cursor=' + cursor).then(res => res.json())
const {
status,
data,
error,
isFetching,
isFetchingMore,
fetchMore,
canFetchMore,
} = useInfiniteQuery('projects', fetchProjects, {
getFetchMore: (lastGroup, allGroups) => lastGroup.nextCursor,
})
return status === 'loading' ? (
<p>Loading...</p>
) : status === 'error' ? (
<p>Error: {error.message}</p>
) : (
<>
{data.map((group, i) => (
<React.Fragment key={i}>
{group.data.map(project => (
<p key={project.id}>{project.name}</p>
))}
</React.Fragment>
))}
<div>
<button
onClick={() => fetchMore()}
disabled={!canFetchMore || isFetchingMore}
>
{isFetchingMore
? 'Loading more...'
: canFetchMore
? 'Load More'
: 'Nothing more to load'}
</button>
</div>
<div>{isFetching && !isFetchingMore ? 'Fetching...' : null}</div>
</>
)
}
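The way getFetchMore drives the sequence of requests can be sketched without React. Everything below is an illustration under stated assumptions: fakeFetchProjects is a hypothetical, synchronous, in-memory stand-in for the cursor-based API described earlier (3 projects per group, 10 projects in total), and the loop plays the role of a user pressing “Load More” until nothing is left.

```javascript
// Hypothetical in-memory data set standing in for the server
const allProjects = Array.from({ length: 10 }, (_, i) => ({
  id: i,
  name: 'Project ' + i,
}))

// Hypothetical synchronous stand-in for GET /api/projects?cursor=N
function fakeFetchProjects(cursor = 0) {
  const data = allProjects.slice(cursor, cursor + 3)
  const next = cursor + 3
  // The last group has no nextCursor, like the final response shown earlier
  return next < allProjects.length ? { data, nextCursor: next } : { data }
}

// Same shape as the getFetchMore option used in the component
const getFetchMore = (lastGroup, allGroups) => lastGroup.nextCursor

// The first group is requested by default
const groups = [fakeFetchProjects()]
let fetchMoreInfo = getFetchMore(groups[groups.length - 1], groups)

// Each pass corresponds to one fetchMore() call while canFetchMore is truthy
while (fetchMoreInfo) {
  groups.push(fakeFetchProjects(fetchMoreInfo))
  fetchMoreInfo = getFetchMore(groups[groups.length - 1], groups)
}

const totalLoaded = groups.reduce((count, group) => count + group.data.length, 0)
```

The loop stops after four groups because the last group has no nextCursor, which is the same condition that makes canFetchMore falsy in the component.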
When an infinite query becomes stale and needs to be refetched, each group is fetched individually and in parallel with the same variables that were originally used to request each group. If an infinite query’s results are ever removed from the cache, the pagination restarts at the initial state with only the initial group being requested.
By default, the info returned from getFetchMore will be supplied to the query function, but in some cases, you may want to override this. You can pass custom variables to the fetchMore function which will override the default info like so:
function Projects() {
const fetchProjects = (key, cursor = 0) =>
fetch('/api/projects?cursor=' + cursor)
const {
status,
data,
isFetching,
isFetchingMore,
fetchMore,
canFetchMore,
} = useInfiniteQuery('projects', fetchProjects, {
getFetchMore: (lastGroup, allGroups) => lastGroup.nextCursor,
})
// Pass your own custom fetchMoreInfo
const skipToCursor50 = () => fetchMore(50)
}
Out of the box, “scroll restoration” for all queries (including paginated and infinite queries) Just Works™️ in React Query. The reason for this is that query results are cached and able to be retrieved synchronously when a query is rendered. As long as your queries are being cached long enough (the default time is 5 minutes) and have not been garbage collected, you should never experience any problems with scroll restoration.
If you ever want to disable a query from automatically running, you can use the manual = true option. When manual is set to true:
- The query starts in the status === 'success' state
Pro Tip #1: Because manual queries start in the status === 'success' state, you should consider supplying an initialData option to pre-populate the cache or similarly use a default parameter value when destructuring the query result
Pro Tip #2: Don’t use manual for dependent queries. Use Dependent Queries instead!
function Todos() {
const { status, data, error, refetch, isFetching } = useQuery(
'todos',
fetchTodoList,
{
manual: true,
initialData: [],
}
)
return (
<>
<button onClick={() => refetch()}>Fetch Todos</button>
{status === 'loading' ? (
<span>Loading...</span>
) : status === 'error' ? (
<span>Error: {error.message}</span>
) : (
// `status === 'success'` will be the initial state, so we need to
// account for our initial data (an empty array)
<>
<ul>
{!data.length
? 'No todos yet...'
: data.map(todo => <li key={todo.id}>{todo.title}</li>)}
</ul>
<div>{isFetching ? 'Fetching...' : null}</div>
</>
)}
</>
)
}
When a useQuery query fails (the function throws an error), React Query will automatically retry the query if that query’s request has not reached the max number of consecutive retries (defaults to 3).
You can configure retries both on a global level and an individual query level.
- Setting retry = false will disable retries.
- Setting retry = 6 will retry failing requests 6 times before showing the final error thrown by the function.
- Setting retry = true will infinitely retry failing requests.
import { useQuery } from 'react-query'
// Make specific query retry a certain number of times
const { status, data, error } = useQuery(
['todos', { page: 1 }],
fetchTodoList,
{
retry: 10, // Will retry failed requests 10 times before displaying an error
}
)
By default, retries in React Query do not happen immediately after a request fails. As is standard, a back-off delay is gradually applied to each retry attempt.
The default retryDelay doubles with each attempt (starting at 1000 ms), but does not exceed 30 seconds:
// Configure for all queries
import { ReactQueryConfigProvider } from 'react-query'
const queryConfig = {
retryDelay: attemptIndex => Math.min(1000 * 2 ** attemptIndex, 30000),
}
function App() {
return (
<ReactQueryConfigProvider config={queryConfig}>
...
</ReactQueryConfigProvider>
)
}
Though it is not recommended, you can override the retryDelay function/integer in both the Provider and individual query options. If set to an integer instead of a function, the delay will always be the same amount of time:
const { status, data, error } = useQuery('todos', fetchTodoList, {
retryDelay: 10000, // Will always wait 10000ms to retry, regardless of how many retries
})
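To see what the two forms of retryDelay produce, the sketch below evaluates the back-off formula shown earlier for the first six attempts and compares it with a fixed integer delay. getRetryDelay is a hypothetical helper for this illustration; only the formula itself comes from the configuration example.

```javascript
// The back-off formula from the configuration example
const exponentialDelay = attemptIndex =>
  Math.min(1000 * 2 ** attemptIndex, 30000)

// Hypothetical helper: accept either a function or a fixed number of milliseconds
function getRetryDelay(retryDelay, attemptIndex) {
  return typeof retryDelay === 'function'
    ? retryDelay(attemptIndex)
    : retryDelay
}

const attempts = [0, 1, 2, 3, 4, 5]

// Doubles each time and is capped at 30 seconds
const backoffDelays = attempts.map(i => getRetryDelay(exponentialDelay, i))
// backoffDelays is [1000, 2000, 4000, 8000, 16000, 30000]

// A fixed integer gives the same delay for every attempt
const fixedDelays = attempts.map(i => getRetryDelay(10000, i))
```

The sixth attempt would be 32000 ms without the cap, so it is clamped to 30000 ms.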
If you’re lucky enough, you may know enough about what your users will do to be able to prefetch the data they need before it’s needed! If this is the case, you can use the Query Cache’s prefetchQuery method to prefetch the results of a query to be placed into the cache:
import { queryCache } from 'react-query'
const prefetchTodos = async () => {
  const queryData = await queryCache.prefetchQuery('todos', () => fetch('/todos'))
  // The results of this query will be cached like a normal query
}
The next time a useQuery instance is used for a prefetched query, it will use the cached data! If no instances of useQuery appear for a prefetched query, it will be deleted and garbage collected after the time specified in cacheTime.
Alternatively, if you already have the data for your query synchronously available, you can use the Query Cache’s setQueryData method to directly add or update a query’s cached result.
There may be times when you already have the initial data for a query synchronously available in your app. If and when this is the case, you can use the config.initialData option to set the initial data for a query and skip the first round of fetching!
When providing an initialData value that is anything other than undefined:
- status will initialize as success instead of loading
- The isStale property will initialize as true instead of false
function Todos() {
const queryInfo = useQuery('todos', () => fetch('/todos'), {
initialData: initialTodos,
})
}
If the process for accessing a query’s initial data is intensive or just not something you want to perform on every render, you can pass a function as the initialData value. This function will be executed only once when the query is initialized, saving you precious memory and CPU:
function Todos() {
const queryInfo = useQuery('todos', () => fetch('/todos'), {
initialData: () => {
return getExpensiveTodos()
},
})
}
In some circumstances, you may be able to provide the initial data for a query from the cached result of another. A good example of this would be searching the cached data from a todos list query for an individual todo item, then using that as the initial data for your individual todo query:
function Todo({ todoId }) {
const queryInfo = useQuery(['todo', todoId], () => fetch('/todos'), {
initialData: () => {
// Use a todo from the 'todos' query as the initial data for this todo query
return queryCache.getQueryData('todos')?.find(d => d.id === todoId)
},
})
}
Most of the time, this pattern works well, but if the source query you’re using to look up the initial data is old, you may not want to use the data at all and just fetch from the server. To make this decision easier, you can use the queryCache.getQuery method instead to get more information about the source query, including an updatedAt timestamp you can use to decide if the query is “fresh” enough for your needs:
function Todo({ todoId }) {
const queryInfo = useQuery(['todo', todoId], () => fetch('/todos'), {
initialData: () => {
// Get the query object
const query = queryCache.getQuery('todos')
// If the query exists and has data that is no older than 10 seconds...
if (query && Date.now() - query.updatedAt <= 10 * 1000) {
// return the individual todo
return query.state.data.find(d => d.id === todoId)
}
// Otherwise, return undefined and let it fetch!
},
})
}
When using SSR (server-side-rendering) with React Query there are a few things to note:
- Queries rendered on the server use the initialData of an unfetched query. This means that by default, data will be set to undefined. To get around this in SSR, you can either pre-seed a query’s cache data using the config.initialData option:
const { status, data, error } = useQuery('todos', fetchTodoList, {
initialData: [{ id: 0, name: 'Implement SSR!' }],
})
// data === [{ id: 0, name: 'Implement SSR!'}]
Or, alternatively, you can supply a default value for data when destructuring your query results:
const { status, data = [{ id: 0, name: 'Implement SSR!' }], error } = useQuery(
'todos',
fetchTodoList
)
The query’s state will still reflect that it is stale and has not been fetched yet, and once mounted, it will continue as normal and request a fresh copy of the query result.
React Query can also be used with React’s new Suspense for Data Fetching APIs. To enable this mode, you can set either the global or query level config’s suspense option to true.
Global configuration:
// Configure for all queries
import { ReactQueryConfigProvider } from 'react-query'
const queryConfig = {
suspense: true,
}
function App() {
return (
<ReactQueryConfigProvider config={queryConfig}>
...
</ReactQueryConfigProvider>
)
}
Query configuration:
import { useQuery } from 'react-query'
// Enable for an individual query
useQuery(queryKey, queryFn, { suspense: true })
When using suspense mode, status states and error objects are not needed and are then replaced by usage of the React.Suspense component (including the use of the fallback prop and React error boundaries for catching errors). Please see the Suspense Example for more information on how to set up suspense mode.
In addition to queries behaving differently in suspense mode, mutations also behave a bit differently. By default, instead of supplying the error variable when a mutation fails, it will be thrown during the next render of the component it’s used in and propagate to the nearest error boundary, similar to query errors. If you wish to disable this, you can set the useErrorBoundary option to false. If you wish that errors are not thrown at all, you can set the throwOnError option to false as well!
Out of the box, React Query in suspense mode works really well as a Fetch-on-render solution with no additional configuration. However, if you want to take it to the next level and implement a Fetch-as-you-render model, we recommend implementing Prefetching on routing and/or user interactions events to initialize queries before they are needed.
By default, queries that become inactive before their promises are resolved are simply ignored instead of cancelled. Why is this?
But don’t worry! If your queries are high-bandwidth or potentially very expensive to download, React Query exposes a generic way to cancel query requests using a cancellation token or other related API. To integrate with this feature, attach a cancel function to the promise returned by your query that implements your request cancellation. When a query becomes out-of-date or inactive, this promise.cancel function will be called (if available):
Using axios:
import axios, { CancelToken } from 'axios'
const query = useQuery('todos', () => {
// Create a new CancelToken source for this request
const source = CancelToken.source()
const promise = axios.get('/todos', {
// Pass the source token to your request
cancelToken: source.token,
})
// Cancel the request if React Query calls the `promise.cancel` method
promise.cancel = () => {
source.cancel('Query was cancelled by React Query')
}
return promise
})
Using fetch:
const query = useQuery('todos', () => {
// Create a new AbortController instance for this request
const controller = new AbortController()
// Get the abortController's signal
const signal = controller.signal
const promise = fetch('/todos', {
method: 'get',
// Pass the signal to your request
signal,
})
// Cancel the request if React Query calls the `promise.cancel` method
// (wrapped in an arrow function so `abort` keeps its `this` binding)
promise.cancel = () => controller.abort()
return promise
})
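The cancel-function pattern can be tried without a network or React. In the sketch below, fetchSlowTodos is a hypothetical query function in which a timer stands in for a slow request; the only part taken from the text is the idea of attaching a cancel function to the returned promise. Exposing the signal on the promise is done purely so the result can be inspected.

```javascript
// Hypothetical query function: a timer stands in for a slow network request
function fetchSlowTodos() {
  const controller = new AbortController()
  const promise = new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve(['todo']), 60 * 1000)
    // When aborted, stop the pending work and reject the promise
    controller.signal.addEventListener('abort', () => {
      clearTimeout(timer)
      reject(new Error('Query was cancelled'))
    })
  })
  // Call abort on the controller itself so it keeps its `this` binding
  promise.cancel = () => controller.abort()
  // Exposed only for inspection in this sketch
  promise.signal = controller.signal
  return promise
}

const pending = fetchSlowTodos()
pending.catch(() => {}) // the rejection caused by cancelling is expected
const abortedBefore = pending.signal.aborted
pending.cancel() // what would happen when the query becomes inactive
const abortedAfter = pending.signal.aborted
```

Assigning controller.abort directly (without the arrow function) detaches the method from its controller, which is why the wrapper is used here.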
Unlike queries, mutations are typically used to create/update/delete data or perform server side-effects. For this purpose, React Query exports a useMutation hook.
Assuming the server implements a ping mutation that returns a “pong” string, here’s an example of the most basic mutation:
const PingPong = () => {
const [mutate, { status, data, error }] = useMutation(pingMutation)
const onPing = async () => {
try {
const data = await mutate()
console.log(data)
// { ping: 'pong' }
} catch {
// Uh oh, something went wrong
}
}
return <button onClick={onPing}>Ping</button>
}
Mutations without variables are not that useful, so let’s add some variables to more closely match reality.
To pass variables to your mutate function, call mutate with an object.
const CreateTodo = () => {
const [title, setTitle] = useState('')
const [mutate] = useMutation(createTodo)
const onCreateTodo = async e => {
// Prevent the form from refreshing the page
e.preventDefault()
try {
await mutate({ title })
// Todo was successfully created
} catch (error) {
// Uh oh, something went wrong
}
}
return (
<form onSubmit={onCreateTodo}>
<input
type="text"
value={title}
onChange={e => setTitle(e.target.value)}
/>
<br />
<button type="submit">Create Todo</button>
</form>
)
}
Even with just variables, mutations aren’t all that special, but when used with the onSuccess option, the Query Cache’s refetchQueries method and the Query Cache’s setQueryData method, mutations become a very powerful tool.
When a mutation succeeds, it’s likely that other queries in your application need to update. Where other libraries that use normalized caches would attempt to update local queries with the new data imperatively, React Query helps you avoid the manual labor that comes with maintaining normalized caches and instead prescribes atomic updates and refetching instead of direct cache manipulation.
For example, assume we have a mutation to post a new todo:
const [mutate] = useMutation(postTodo)
When a successful postTodo mutation happens, we likely want all todos queries to get refetched to show the new todo item. To do this, you can use useMutation’s onSuccess option and the queryCache’s refetchQueries method:
import { useMutation, queryCache } from 'react-query'
// When this mutation succeeds, refetch any queries with the `todos` or `reminders` query key
const [mutate] = useMutation(addTodo, {
onSuccess: () => {
queryCache.refetchQueries('todos')
queryCache.refetchQueries('reminders')
},
})
mutate(todo)
// The 3 queries below will be refetched when the mutation above succeeds
const todoListQuery = useQuery('todos', fetchTodoList)
const todoListQuery = useQuery(['todos', { page: 1 }], fetchTodoList)
const remindersQuery = useQuery('reminders', fetchReminders)
You can even refetch queries with specific variables by passing a more specific query key to the refetchQueries method:
const [mutate] = useMutation(addTodo, {
onSuccess: () => {
queryCache.refetchQueries(['todos', { status: 'done' }])
},
})
mutate(todo)
// The query below will be refetched when the mutation above succeeds
const todoListQuery = useQuery(['todos', { status: 'done' }], fetchTodoList)
// However, the following query below will NOT be refetched
const todoListQuery = useQuery('todos', fetchTodoList)
The refetchQueries API is very flexible, so even if you want to only refetch todos queries that don’t have any more variables or sub keys, you can pass an exact: true option to the refetchQueries method:
const [mutate] = useMutation(addTodo, {
onSuccess: () => {
queryCache.refetchQueries('todos', { exact: true })
},
})
mutate(todo)
// The query below will be refetched when the mutation above succeeds
const todoListQuery = useQuery(['todos'], fetchTodoList)
// However, the following query below will NOT be refetched
const todoListQuery = useQuery(['todos', { status: 'done' }], fetchTodoList)
If you find yourself wanting even more granularity, you can pass a predicate function to the refetchQueries method. This function will receive each query object from the queryCache and allow you to return true or false for whether you want to refetch that query:
const [mutate] = useMutation(addTodo, {
onSuccess: () => {
queryCache.refetchQueries(
query => query.queryKey[0] === 'todos' && query.queryKey[1]?.version >= 10
)
},
})
mutate(todo)
// The query below will be refetched when the mutation above succeeds
const todoListQuery = useQuery(['todos', { version: 20 }], fetchTodoList)
// The query below will be refetched when the mutation above succeeds
const todoListQuery = useQuery(['todos', { version: 10 }], fetchTodoList)
// However, the following query below will NOT be refetched
const todoListQuery = useQuery(['todos', { version: 5 }], fetchTodoList)
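The three ways of targeting queries (key prefix, exact key, predicate) can be summarized in one small matcher. matchesQuery is a hypothetical function written for this illustration and is not how refetchQueries is implemented. As a simplification it compares key items with JSON.stringify, so objects only match when their properties are listed in the same order.

```javascript
// Hypothetical matcher; not React Query's implementation.
function matchesQuery(queryKey, target, { exact = false } = {}) {
  // Predicate form: the function decides, given a query-like object
  if (typeof target === 'function') {
    return Boolean(target({ queryKey }))
  }
  const targetKey = Array.isArray(target) ? target : [target]
  // Exact form: the key must not have any extra items
  if (exact && targetKey.length !== queryKey.length) return false
  // Prefix form: every item of the target must match the same position
  return targetKey.every(
    (item, i) => JSON.stringify(item) === JSON.stringify(queryKey[i])
  )
}

const keys = {
  list: ['todos'],
  done: ['todos', { status: 'done' }],
  v20: ['todos', { version: 20 }],
  v5: ['todos', { version: 5 }],
  reminders: ['reminders'],
}

const isNewTodos = query =>
  query.queryKey[0] === 'todos' && query.queryKey[1]?.version >= 10

const results = {
  prefixList: matchesQuery(keys.list, 'todos'),
  prefixDone: matchesQuery(keys.done, 'todos'),
  prefixReminders: matchesQuery(keys.reminders, 'todos'),
  specificDone: matchesQuery(keys.done, ['todos', { status: 'done' }]),
  specificList: matchesQuery(keys.list, ['todos', { status: 'done' }]),
  exactList: matchesQuery(keys.list, 'todos', { exact: true }),
  exactDone: matchesQuery(keys.done, 'todos', { exact: true }),
  predicateV20: matchesQuery(keys.v20, isNewTodos),
  predicateV5: matchesQuery(keys.v5, isNewTodos),
}
```

The results line up with the comments in the examples: a plain 'todos' target matches every key that starts with 'todos', a more specific key does not match the bare list query, exact: true excludes keys with extra items, and the predicate keeps only versions of 10 or higher.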
If you prefer that the promise returned from mutate() only resolves after the onSuccess callback, you can return a promise in the onSuccess callback:
const [mutate] = useMutation(addTodo, {
onSuccess: () =>
// return a promise!
queryCache.refetchQueries(
query => query.queryKey[0] === 'todos' && query.queryKey[1]?.version >= 10
),
})
const run = async () => {
try {
await mutate(todo)
console.log('I will only log after onSuccess is done!')
} catch {}
}
If you would like to refetch queries on error or even regardless of a mutation’s success or error, you can use the onError or onSettled callbacks:
const [mutate] = useMutation(addTodo, {
onError: error => {
// Refetch queries or more...
},
onSettled: (data, error) => {
// Refetch queries or more...
},
})
mutate(todo)
You might find that you want to override some of useMutation’s options at the time of calling mutate. To do that, you can pass them as options to the mutate function after your mutation variable. Supported option overrides include:
onSuccess
onSettled
onError
throwOnError
const [mutate] = useMutation(addTodo)
mutate(todo, {
onSuccess: () => {},
onSettled: () => {},
onError: () => {},
throwOnError: true,
})
When dealing with mutations that update objects on the server, it’s common for the new object to be automatically returned in the response of the mutation. Instead of refetching any queries for that item and wasting a network call for data we already have, we can take advantage of the object returned by the mutation function and update the existing query with the new data immediately using the Query Cache’s setQueryData method:
const [mutate] = useMutation(editTodo)
mutate(
{
id: 5,
name: 'Do the laundry',
},
{
onSuccess: data => queryCache.setQueryData(['todo', { id: 5 }], data),
}
)
// The query below will be updated with the response from the
// successful mutation
const { status, data, error } = useQuery(['todo', { id: 5 }], fetchTodoByID)
It’s sometimes the case that you need to clear the error or data of a mutation request. To do this, you can use the reset function:
const CreateTodo = () => {
const [title, setTitle] = useState('')
const [mutate, { error, reset }] = useMutation(createTodo)
const onCreateTodo = async e => {
e.preventDefault()
await mutate({ title })
}
return (
<form onSubmit={onCreateTodo}>
{error && <h5 onClick={() => reset()}>{error}</h5>}
<input
type="text"
value={title}
onChange={e => setTitle(e.target.value)}
/>
<br />
<button type="submit">Create Todo</button>
</form>
)
}
In rare circumstances, you may want to manually update a query’s response with a custom value. To do this, you can again use the Query Cache’s setQueryData
method:
It’s important to understand that when you manually or optimistically update a query’s data value, there is a high risk of displaying out-of-sync data to your users. It’s recommended that you only do this if you plan to refetch the query very soon or perform a mutation to “commit” your manual changes (and also roll back your eager update if the refetch or mutation fails).
// Full replacement
queryCache.setQueryData(['todo', { id: 5 }], newTodo)
// or functional update
queryCache.setQueryData(['todo', { id: 5 }], previous => ({
...previous,
status: 'done',
}))
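To make the two update styles concrete, here is a minimal sketch of a cache that accepts either a replacement value or an updater function. createTinyCache is a hypothetical stand-in written for this example; it is not React Query’s queryCache and skips staleness, garbage collection and stable key hashing.

```javascript
// Simplified stand-in for a query cache; NOT React Query's implementation.
// Keys are serialized with JSON.stringify, which is enough for this demo
// because each key below is always written with the same property order.
function createTinyCache() {
  const store = new Map()
  const hash = key => JSON.stringify(key)

  return {
    getQueryData: key => store.get(hash(key)),
    setQueryData(key, updater) {
      const previous = store.get(hash(key))
      // A function updater receives the previous value; anything else
      // replaces the cached value outright.
      const next = typeof updater === 'function' ? updater(previous) : updater
      store.set(hash(key), next)
      return next
    },
  }
}

const cache = createTinyCache()
const todoKey = ['todo', { id: 5 }]

// Full replacement
cache.setQueryData(todoKey, { id: 5, name: 'Do the laundry', status: 'pending' })

// Functional update: copy the previous value and change one field
cache.setQueryData(todoKey, previous => ({ ...previous, status: 'done' }))

console.log(cache.getQueryData(todoKey))
```

The functional form is handy when the new value depends on what is already cached, since you never have to read the cache yourself first.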
A query’s status === 'loading' state is sufficient to show the initial hard-loading state for a query, but sometimes you may want to display an additional indicator that a query is refetching in the background. To do this, queries also supply an isFetching boolean that you can use to show that a query is fetching, regardless of the state of the status variable:
function Todos() {
const { status, data: todos, error, isFetching } = useQuery(
'todos',
fetchTodos
)
return status === 'loading' ? (
<span>Loading...</span>
) : status === 'error' ? (
<span>Error: {error.message}</span>
) : (
<>
{isFetching ? <div>Refreshing...</div> : null}
<div>
{todos.map(todo => (
<Todo key={todo.id} todo={todo} />
))}
</div>
</>
)
}
In addition to individual query loading states, if you would like to show a global loading indicator when any queries are fetching (including in the background), you can use the useIsFetching
hook:
import { useIsFetching } from 'react-query'
function GlobalLoadingIndicator() {
const isFetching = useIsFetching()
return isFetching ? (
<div>Queries are fetching in the background...</div>
) : null
}
If a user leaves your application and returns to stale data, you may want to trigger an update in the background to update any stale queries. Thankfully, React Query does this automatically for you. If you would rather turn this behavior off, set the ReactQueryConfigProvider’s refetchAllOnWindowFocus option to false:
const queryConfig = { refetchAllOnWindowFocus: false }
function App() {
return (
<ReactQueryConfigProvider config={queryConfig}>
...
</ReactQueryConfigProvider>
)
}
In rare circumstances, you may want to manage your own window focus events that trigger React Query to revalidate. To do this, React Query provides a setFocusHandler
function that supplies you the callback that should be fired when the window is focused and allows you to set up your own events. When calling setFocusHandler
, the previously set handler is removed (which in most cases will be the default handler) and your new handler is used instead. For example, this is the default handler:
setFocusHandler(handleFocus => {
// Listen to visibilitychange and focus
if (typeof window !== 'undefined' && window.addEventListener) {
window.addEventListener('visibilitychange', handleFocus, false)
window.addEventListener('focus', handleFocus, false)
}
return () => {
// Be sure to unsubscribe if a new handler is set
window.removeEventListener('visibilitychange', handleFocus)
window.removeEventListener('focus', handleFocus)
}
})
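The “previous handler is removed” rule is easier to see in isolation. The sketch below is a hypothetical registry (createFocusRegistry and setDemoFocusHandler are names made up for this example) that calls the cleanup function returned by the old handler before installing a new one. It uses a plain array instead of window events so it can run anywhere; React Query’s own internals may differ.

```javascript
// Hypothetical registry illustrating the "replace the previous handler"
// behaviour; React Query's internals may differ.
function createFocusRegistry(onFocus) {
  let removePrevious = null

  return function setDemoFocusHandler(setup) {
    // Tear down whatever the previously registered handler set up
    if (removePrevious) removePrevious()
    // `setup` receives the callback to fire on focus and returns a cleanup
    removePrevious = setup(onFocus) || null
  }
}

// Demo with a fake event source instead of `window`
const listeners = []
let refetchCount = 0
const setDemoFocusHandler = createFocusRegistry(() => {
  refetchCount += 1
})

const subscribe = handleFocus => {
  listeners.push(handleFocus)
  return () => listeners.splice(listeners.indexOf(handleFocus), 1)
}

setDemoFocusHandler(subscribe) // first handler
setDemoFocusHandler(subscribe) // replaces it: the old listener is removed first

listeners.forEach(listener => listener()) // simulate one focus event
console.log(listeners.length, refetchCount)
```

Because the cleanup runs before the new handler is installed, one focus event triggers exactly one revalidation, even after the handler has been replaced.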
A great use-case for replacing the focus handler is that of iframe events. Iframes present problems with detecting window focus by both double-firing events and also firing false-positive events when focusing or using iframes within your app. If you experience this, you should use an event handler that ignores these events as much as possible. I recommend this one! It can be set up in the following way:
import { setFocusHandler } from 'react-query'
import onWindowFocus from './onWindowFocus' // The gist above
setFocusHandler(onWindowFocus) // Boom!
WARNING: This is an advanced and experimental feature. There be dragons here. Do not change the Query Key Serializer unless you know what you are doing and are fine with encountering edge cases in React Query’s API
If you absolutely despise the default query key implementation, then please file an issue in this repo first. If you still believe you need something different, then you can replace the default query key serializer with your own by using the ReactQueryConfigProvider component’s queryKeySerializerFn option:
const queryConfig = {
queryKeySerializerFn: queryKey => {
// Your custom logic here...
// Make sure object keys are sorted and all values are
// serializable
const queryFnArgs = getQueryArgs(queryKey)
// Hash the query key args to get a string
const queryHash = hash(queryFnArgs)
// Return both the queryHash and queryFnArgs as a tuple
return [queryHash, queryFnArgs]
},
}
function App() {
return (
<ReactQueryConfigProvider config={queryConfig}>
...
</ReactQueryConfigProvider>
)
}
userQueryKey: any
The query key passed to useQuery and all other public methods and utilities exported by React Query.
queryHash: string
A unique string representing the entire query key.
queryFnArgs: Array<any>
An additional stableStringify utility is also exported to help stringify objects with sorted keys.
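To show why sorted keys matter for a query hash, here is a minimal sketch of a key-sorting stringifier. sortedStringify is a hypothetical helper written for this example; the exported stableStringify may be implemented differently.

```javascript
// Hypothetical helper for illustration; the exported stableStringify may be
// implemented differently.
function sortedStringify(value) {
  return JSON.stringify(value, (key, val) => {
    // Rebuild plain objects with their keys in sorted order; arrays and
    // primitives pass through untouched.
    if (val !== null && typeof val === 'object' && !Array.isArray(val)) {
      return Object.keys(val)
        .sort()
        .reduce((sorted, k) => {
          sorted[k] = val[k]
          return sorted
        }, {})
    }
    return val
  })
}

const a = sortedStringify({ status: 'pending', page: 2 })
const b = sortedStringify({ page: 2, status: 'pending' })
console.log(a === b, a)
```

Two objects with the same entries in a different order produce the same string, so they can safely be used as the same cache key.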
The example below shows how to build your own serializer for use with urls and use it with React Query:
import { ReactQueryConfigProvider, stableStringify } from 'react-query'
function urlQueryKeySerializer(queryKey) {
// Deconstruct the url
let [url, params = ''] = queryKey.split('?')
// Remove trailing slashes from the url to make an ID
url = url.replace(/\/{1,}$/, '')
// Split the search string into individual "key=value" params
params = params.split('&').filter(Boolean)
// If there are search params, return a different key
if (params.length) {
let searchQuery = {}
params.forEach(param => {
const [key, value] = param.split('=')
searchQuery[key] = value
})
// Use stableStringify to turn searchQuery into a stable string
const searchQueryHash = stableStringify(searchQuery)
// Get the stable json object for the normalized key
searchQuery = JSON.parse(searchQueryHash)
return [`${url}_${searchQueryHash}`, [url, searchQuery]]
}
return [url, [url]]
}
const queryConfig = {
queryKeySerializerFn: urlQueryKeySerializer,
}
function App() {
return (
<ReactQueryConfigProvider config={queryConfig}>
...
</ReactQueryConfigProvider>
)
}
// Heck, you can even make your own custom useQueryHook!
function useUrlQuery(url, options) {
return useQuery(
url,
(url, params) =>
axios
.get(url, {
params,
})
.then(res => res.data),
// Forward the caller's options to useQuery
options
)
}
// Use it in your app!
function Todos() {
const todosQuery = useUrlQuery(`/todos`)
}
function FilteredTodos({ status = 'pending' }) {
const todosQuery = useUrlQuery(`/todos?status=${status}`)
}
function Todo({ id }) {
const todoQuery = useUrlQuery(`/todos/${id}`)
}
queryCache.refetchQueries('/todos')
queryCache.refetchQueries('/todos?status=pending')
queryCache.refetchQueries('/todos/5')
The example below shows how to build your own functional serializer and use it with React Query:
import { ReactQueryConfigProvider, stableStringify } from 'react-query'
// A map to keep track of our function pointers
const functionSerializerMap = new Map()
function functionQueryKeySerializer(queryKey) {
if (!queryKey) {
return []
}
let queryFn = queryKey
let variables
if (Array.isArray(queryKey)) {
queryFn = queryKey[0]
variables = queryKey[1]
}
// Get or create an ID for the function pointer
const queryGroupId =
functionSerializerMap.get(queryFn) ||
(() => {
const id = Date.now()
functionSerializerMap.set(queryFn, id)
return id
})()
const variablesIsObject = isObject(variables)
const variablesHash = variablesIsObject ? stableStringify(variables) : ''
const queryHash = `${queryGroupId}_${variablesHash}`
return [queryHash, queryGroupId, variablesHash, variables]
}
const queryConfig = {
queryKeySerializerFn: functionQueryKeySerializer,
}
function App() {
return (
<ReactQueryConfigProvider config={queryConfig}>
...
</ReactQueryConfigProvider>
)
}
// Heck, you can even make your own custom useQueryHook!
function useFunctionQuery(functionTuple, options) {
const [queryFn, variables] = Array.isArray(functionTuple)
? functionTuple
: [functionTuple]
return useQuery(functionTuple, queryFn, options)
}
// Use it in your app!
function Todos() {
const todosQuery = useFunctionQuery(getTodos)
}
function FilteredTodos({ status = 'pending' }) {
const todosQuery = useFunctionQuery([getTodos, { status }])
}
function Todo({ id }) {
const todoQuery = useFunctionQuery([getTodo, { id }])
}
queryCache.refetchQueries(getTodos)
queryCache.refetchQueries([getTodos, { status: 'pending' }])
queryCache.refetchQueries([getTodo, { id: 5 }])
useQuery
const {
status,
data,
error,
isFetching,
failureCount,
refetch,
} = useQuery(queryKey, [, queryVariables], queryFn, {
manual,
retry,
retryDelay,
staleTime,
cacheTime,
refetchInterval,
refetchIntervalInBackground,
refetchOnWindowFocus,
onSuccess,
onError,
onSettled,
suspense,
initialData,
refetchOnMount
})
queryKey: String | [String, Variables: Object] | falsey | Function => queryKey
If a [String, Object] tuple is passed, they will be serialized into a stable query key. See Query Keys for more information.
The query will automatically update when this key changes (if manual is not set to true).
Variables: Object
queryFn: Function(variables) => Promise(data/error)
manual: Boolean
Set this to true to disable automatic refetching when the query mounts or changes query keys.
To refetch the query, use the refetch method returned from the useQuery instance.
retry: Boolean | Int
If false, failed queries will not retry by default.
If true, failed queries will retry infinitely.
If set to an Int, e.g. 3, failed queries will retry until the failed query count meets that number.
retryDelay: Function(retryAttempt: Int) => Int
This function receives a retryAttempt integer and returns the delay to apply before the next attempt in milliseconds.
A function like attempt => Math.min(attempt > 1 ? 2 ** attempt * 1000 : 1000, 30 * 1000) applies exponential backoff.
A function like attempt => attempt * 1000 applies linear backoff.
staleTime: Int
cacheTime: Int
refetchInterval: false | Integer
refetchIntervalInBackground: Boolean
If set to true, queries that are set to continuously refetch with a refetchInterval will continue to refetch while their tab/window is in the background.
refetchOnWindowFocus: Boolean
Set this to false to disable automatic refetching on window focus (useful when refetchAllOnWindowFocus is set to true).
Set this to true to enable automatic refetching on window focus (useful when refetchAllOnWindowFocus is set to false).
onSuccess: Function(data) => data
onError: Function(err) => void
onSettled: Function(data, error) => data
suspense: Boolean
Set this to true to enable suspense mode.
When true, useQuery will suspend when status === 'loading'.
When true, useQuery will throw runtime errors when status === 'error'.
initialData: any | Function() => any
refetchOnMount: Boolean
Defaults to true.
If set to false, additional instances of a query will not trigger background refetches.
status: String
Will be:
loading if the query is in an initial loading state. This means there is no cached data and the query is currently fetching, e.g. isFetching === true.
error if the query attempt resulted in an error. The corresponding error property has the error received from the attempted fetch.
success if the query has received a response with no errors and is ready to display its data. The corresponding data property on the query is the data received from the successful fetch or, if the query is in manual mode and has not been fetched yet, data is the first initialData supplied to the query on initialization.
data: Any
Defaults to undefined.
error: null | Error
Defaults to null.
isFetching: Boolean
Defaults to true so long as manual is set to false.
Will be true if the query is currently fetching, including background fetching.
failureCount: Integer
Reset to 0 when the query succeeds.
refetch: Function({ variables: Object, merge: Function, disableThrow: Boolean })
Set disableThrow to true to disable this function from throwing if an error is encountered.
usePaginatedQuery
const {
status,
resolvedData,
latestData,
error,
isFetching,
failureCount,
refetch,
} = usePaginatedQuery(queryKey, [, queryVariables], queryFn, {
manual,
retry,
retryDelay,
staleTime,
cacheTime,
refetchInterval,
refetchIntervalInBackground,
refetchOnWindowFocus,
onSuccess,
onError,
onSettled,
suspense,
initialData,
refetchOnMount
})
queryKey: String | [String, Variables: Object] | falsey | Function => queryKey
If a [String, Object] tuple is passed, they will be serialized into a stable query key. See Query Keys for more information.
The query will automatically update when this key changes (if manual is not set to true).
Variables: Object
queryFn: Function(variables) => Promise(data/error)
manual: Boolean
Set this to true to disable automatic refetching when the query mounts or changes query keys.
To refetch the query, use the refetch method returned from the useQuery instance.
retry: Boolean | Int
If false, failed queries will not retry by default.
If true, failed queries will retry infinitely.
If set to an Int, e.g. 3, failed queries will retry until the failed query count meets that number.
retryDelay: Function(retryAttempt: Int) => Int
This function receives a retryAttempt integer and returns the delay to apply before the next attempt in milliseconds.
A function like attempt => Math.min(attempt > 1 ? 2 ** attempt * 1000 : 1000, 30 * 1000) applies exponential backoff.
A function like attempt => attempt * 1000 applies linear backoff.
staleTime: Int
cacheTime: Int
refetchInterval: false | Integer
refetchIntervalInBackground: Boolean
If set to true, queries that are set to continuously refetch with a refetchInterval will continue to refetch while their tab/window is in the background.
refetchOnWindowFocus: Boolean
Set this to false to disable automatic refetching on window focus (useful when refetchAllOnWindowFocus is set to true).
Set this to true to enable automatic refetching on window focus (useful when refetchAllOnWindowFocus is set to false).
onSuccess: Function(data) => data
onError: Function(error) => void
onSettled: Function(data, error) => data
suspense: Boolean
Set this to true to enable suspense mode.
When true, useQuery will suspend when status === 'loading'.
When true, useQuery will throw runtime errors when status === 'error'.
initialData: any
refetchOnMount: Boolean
Defaults to true.
If set to false, additional instances of a query will not trigger background refetches.
status: String
Will be:
loading if the query is in an initial loading state. This means there is no cached data and the query is currently fetching, e.g. isFetching === true.
error if the query attempt resulted in an error. The corresponding error property has the error received from the attempted fetch.
success if the query has received a response with no errors and is ready to display its data. The corresponding data property on the query is the data received from the successful fetch or, if the query is in manual mode and has not been fetched yet, data is the first initialData supplied to the query on initialization.
resolvedData: Any
Defaults to undefined.
latestData: Any
Defaults to undefined.
error: null | Error
Defaults to null.
isFetching: Boolean
Defaults to true so long as manual is set to false.
Will be true if the query is currently fetching, including background fetching.
isFetchingMore: Boolean
If using paginated mode, this will be true when fetching more results using the fetchMore function.
failureCount: Integer
Reset to 0 when the query succeeds.
refetch: Function({ variables: Object, merge: Function, disableThrow: Boolean })
Set disableThrow to true to disable this function from throwing if an error is encountered.
fetchMore: Function(variables) => Promise
If using paginated mode, this function allows you to fetch the next “page” of results.
variables should be an object that is passed to your query function to retrieve the next page of results.
useInfiniteQuery
const queryFn = (...queryKey, nextPageVariables) => Promise
const {
status,
data,
error,
isFetching,
isFetchingMore,
failureCount,
refetch,
fetchMore,
canFetchMore,
} = useInfiniteQuery(queryKey, [, queryVariables], queryFn, {
getFetchMore: (lastPage, allPages) => nextPageVariables,
manual,
retry,
retryDelay,
staleTime,
cacheTime,
refetchInterval,
refetchIntervalInBackground,
refetchOnWindowFocus,
onSuccess,
onError,
onSettled,
suspense,
initialData,
refetchOnMount
})
queryKey: String | [String, Variables: Object] | falsey | Function => queryKey
If a [String, Object] tuple is passed, they will be serialized into a stable query key. See Query Keys for more information.
The query will automatically update when this key changes (if manual is not set to true).
Variables: Object
queryFn: Function(variables) => Promise(data/error)
Also receives the value returned from the getFetchMore function, used to fetch the next page.
getFetchMore: Function | Boolean
manual: Boolean
Set this to true to disable automatic refetching when the query mounts or changes query keys.
To refetch the query, use the refetch method returned from the useQuery instance.
retry: Boolean | Int
If false, failed queries will not retry by default.
If true, failed queries will retry infinitely.
If set to an Int, e.g. 3, failed queries will retry until the failed query count meets that number.
retryDelay: Function(retryAttempt: Int) => Int
This function receives a retryAttempt integer and returns the delay to apply before the next attempt in milliseconds.
A function like attempt => Math.min(attempt > 1 ? 2 ** attempt * 1000 : 1000, 30 * 1000) applies exponential backoff.
A function like attempt => attempt * 1000 applies linear backoff.
staleTime: Int
cacheTime: Int
refetchInterval: false | Integer
refetchIntervalInBackground: Boolean
If set to true, queries that are set to continuously refetch with a refetchInterval will continue to refetch while their tab/window is in the background.
refetchOnWindowFocus: Boolean
Set this to false to disable automatic refetching on window focus (useful when refetchAllOnWindowFocus is set to true).
Set this to true to enable automatic refetching on window focus (useful when refetchAllOnWindowFocus is set to false).
onSuccess: Function(data) => data
onError: Function(err) => void
onSettled: Function(data, error) => data
suspense: Boolean
Set this to true to enable suspense mode.
When true, useQuery will suspend when status === 'loading'.
When true, useQuery will throw runtime errors when status === 'error'.
initialData: any
refetchOnMount: Boolean
Defaults to true.
If set to false, additional instances of a query will not trigger background refetches.
status: String
Will be:
loading if the query is in an initial loading state. This means there is no cached data and the query is currently fetching, e.g. isFetching === true.
error if the query attempt resulted in an error. The corresponding error property has the error received from the attempted fetch.
success if the query has received a response with no errors and is ready to display its data. The corresponding data property on the query is the data received from the successful fetch or, if the query is in manual mode and has not been fetched yet, data is the first initialData supplied to the query on initialization.
data: Any
Defaults to [].
error: null | Error
Defaults to null.
isFetching: Boolean
Defaults to true so long as manual is set to false.
Will be true if the query is currently fetching, including background fetching.
isFetchingMore: Boolean
If using paginated mode, this will be true when fetching more results using the fetchMore function.
failureCount: Integer
Reset to 0 when the query succeeds.
refetch: Function({ variables: Object, merge: Function, disableThrow: Boolean })
Set disableThrow to true to disable this function from throwing if an error is encountered.
fetchMore: Function(fetchMoreVariablesOverride) => Promise
If using paginated mode, this function allows you to fetch the next “page” of results.
variables should be an object that is passed to your query function to retrieve the next page of results.
canFetchMore: Boolean
If using paginated mode, this will be true if there is more data to be fetched (known via the required getFetchMore option function).
useMutation
const [mutate, { status, data, error }] = useMutation(mutationFn, {
onSuccess,
onSettled,
onError,
throwOnError,
useErrorBoundary,
})
const promise = mutate(variables, {
onSuccess,
onSettled,
onError,
throwOnError,
})
mutationFn: Function(variables) => Promise
variables: any
The variables object to pass to the mutationFn.
onSuccess: Function(data) => Promise | undefined
onError: Function(err) => Promise | undefined
onSettled: Function(data, error) => Promise | undefined
throwOnError
Defaults to false.
Set this to true if failed mutations should re-throw errors from the mutation function to the mutate function.
useErrorBoundary
Defaults to the global query config’s useErrorBoundary value, which is false.
mutate: Function(variables, { onSuccess, onSettled, onError, throwOnError })
status: String
Will be:
loading if the mutation is currently executing.
error if the last mutation attempt resulted in an error.
data: null | Any
Defaults to null.
error: null | Error
promise: Promise
The promise returned by the mutationFn.
queryCache
The queryCache instance is the backbone of React Query that manages all of the state, caching, lifecycle and magic of every query. It supports relatively unrestricted, but safe, access to manipulate queries as you need. Its available properties and methods are:
prefetchQuery
getQueryData
setQueryData
refetchQueries
removeQueries
getQuery
subscribe
isFetching
clear
queryCache.prefetchQuery
prefetchQuery
is an asynchronous function that can be used to fetch and cache a query response before it is needed or fetched with useQuery
. If the query does not exist, it will be created and immediately be marked as stale. If the query is not utilized by a query hook in the default cacheTime
of 5 minutes, the query will be garbage collected.
The difference between using prefetchQuery and updateQuery is that prefetchQuery is async and will ensure that duplicate requests for this query are not created when useQuery instances for the same query are rendered while the data is fetching.
import { queryCache } from 'react-query'
const data = await queryCache.prefetchQuery(queryKey, queryFn)
For convenience in syntax, you can also pass optional query variables to prefetchQuery just like you can with useQuery:
import { queryCache } from 'react-query'
const data = await queryCache.prefetchQuery(queryKey, queryVariables, queryFn, config)
The options for prefetchQuery
are exactly the same as those of useQuery
with the exception of:
config.throwOnError: Boolean
Set this to true if you want prefetchQuery to throw an error when it encounters errors.
promise: Promise
By default, the returned promise does not throw on errors unless you set the throwOnError option to true.
queryCache.getQueryData
getQueryData
is a synchronous function that can be used to get an existing query’s cached data. If the query does not exist, undefined
will be returned.
import { queryCache } from 'react-query'
const data = queryCache.getQueryData(queryKey)
queryKey: QueryKey
data: any | undefined
Will be undefined if the query does not exist.
queryCache.setQueryData
setQueryData
is a synchronous function that can be used to immediately update a query’s cached data. If the query does not exist, it will be created and immediately be marked as stale. If the query is not utilized by a query hook in the default cacheTime
of 5 minutes, the query will be garbage collected.
The difference between using setQueryData and updateQuery is that setQueryData is sync and assumes that you already synchronously have the data available. If you need to fetch the data asynchronously, it’s suggested that you either refetch the query key or use prefetchQuery to handle the asynchronous fetch.
import { queryCache } from 'react-query'
queryCache.setQueryData(queryKey, updater)
queryKey: QueryKey
updater: Any | Function(oldData) => newData
queryCache.setQueryData(queryKey, newData)
For convenience in syntax, you can also pass an updater function which receives the current data value and returns the new one:
queryCache.setQueryData(queryKey, oldData => newData)
queryCache.refetchQueries
The refetchQueries
method can be used to refetch multiple queries in cache based on their query keys or any other functionally accessible property/state of the query.
import { queryCache } from 'react-query'
const queries = queryCache.refetchQueries(queryKeyOrPredicateFn, {
exact,
throwOnError,
})
queryKeyOrPredicateFn can either be a Query Key or a function
queryKey: QueryKey
Queries are matched inclusively by query key. For example, if you passed 'todos', it would match queries with the todos, ['todos'], and ['todos', 5] keys.
Function(query) => Boolean
This function is called for each query and should return true for queries that should be found.
The exact option has no effect when using a function.
exact: Boolean
Pass the exact: true option to return only the query with the exact query key you have passed. Remember to destructure it out of the array!
throwOnError: Boolean
When set to true, this function will throw if any of the query refetch tasks fail.
This function returns a promise that will resolve when all of the queries are done being refetched. By default, it will not throw an error if any of those query refetches fail, but this can be configured by setting the throwOnError option to true.
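The matching rules above (inclusive by default, exact on request, or a predicate function) can be sketched without React Query at all. findQueries and the { queryKey } objects below are hypothetical stand-ins for this example; the real cache works on its own internal query objects and hashes keys differently.

```javascript
// Simplified stand-in for inclusive query-key matching; the real cache
// works on its own internal query objects.
const toArray = key => (Array.isArray(key) ? key : [key])
const same = (x, y) => JSON.stringify(x) === JSON.stringify(y)

function findQueries(queries, queryKeyOrPredicateFn, { exact = false } = {}) {
  // A predicate function is used as-is; `exact` has no effect on it
  if (typeof queryKeyOrPredicateFn === 'function') {
    return queries.filter(query => queryKeyOrPredicateFn(query))
  }
  const wanted = toArray(queryKeyOrPredicateFn)
  return queries.filter(query => {
    const actual = toArray(query.queryKey)
    if (exact && actual.length !== wanted.length) return false
    // Inclusive match: every part of the wanted key must lead the query key
    return wanted.every((part, i) => same(part, actual[i]))
  })
}

const cached = [
  { queryKey: 'todos' },
  { queryKey: ['todos'] },
  { queryKey: ['todos', 5] },
  { queryKey: ['users'] },
]

console.log(findQueries(cached, 'todos').length) // 3
console.log(findQueries(cached, ['todos', 5], { exact: true }).length) // 1
console.log(findQueries(cached, q => toArray(q.queryKey)[0] === 'users').length) // 1
```

Inclusive matching is what lets a single call with 'todos' reach both the list query and every individual todo query.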
queryCache.removeQueries
The removeQueries
method can be used to remove queries from the cache based on their query keys or any other functionally accessible property/state of the query.
import { queryCache } from 'react-query'
queryCache.removeQueries(queryKeyOrPredicateFn, {
exact,
})
queryKeyOrPredicateFn can either be a Query Key or a function
queryKey
Queries are matched inclusively by query key. For example, if you passed 'todos', it would match queries with the todos, ['todos'], and ['todos', 5] keys.
Function(query) => Boolean
This function is called for each query and should return true for queries that should be found.
The exact option has no effect when using a function.
exact: Boolean
Pass the exact: true option to match only the query with the exact query key you have passed.
This function does not return anything.
queryCache.getQuery
getQuery
is a slightly more advanced synchronous function that can be used to get an existing query object from the cache. This object not only contains all the state for the query, but all of the instances, and underlying guts of the query as well. If the query does not exist, undefined
will be returned.
Note: This is not typically needed for most applications, but can come in handy when needing more information about a query in rare scenarios (eg. Looking at the query.updatedAt timestamp to decide whether a query is fresh enough to be used as an initial value)
import { queryCache } from 'react-query'
const query = queryCache.getQuery(queryKey)
queryKey: QueryKey
query: QueryObject
queryCache.isFetching
This isFetching property is an integer representing how many queries, if any, in the cache are currently fetching (including background-fetching, loading new pages, or loading more infinite query results).
import { queryCache } from 'react-query'
if (queryCache.isFetching) {
console.log('At least one query is fetching!')
}
React Query also exports a handy useIsFetching
hook that will let you subscribe to this state in your components without creating a manual subscription to the query cache.
queryCache.subscribe
The subscribe
method can be used to subscribe to the query cache as a whole and be informed of safe/known updates to the cache like query states changing or queries being updated, added or removed
import { queryCache } from 'react-query'
const callback = cache => {}
const unsubscribe = queryCache.subscribe(callback)
callback: Function(queryCache) => void
Called with the query cache whenever it changes through safe/known updates (e.g. query.setState, queryCache.removeQueries, etc). Out of scope mutations to the queryCache are not encouraged and will not fire subscription callbacks.
unsubscribe: Function => void
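The subscribe contract (a callback that receives the cache, plus a returned unsubscribe function) follows a common observer pattern. Below is a minimal sketch of that pattern; createObservableCache and its set/get methods are hypothetical and are not React Query’s queryCache.

```javascript
// Hypothetical observable cache showing the subscribe/unsubscribe contract;
// it is not React Query's queryCache.
function createObservableCache() {
  const data = new Map()
  const listeners = new Set()
  const cache = {
    subscribe(callback) {
      listeners.add(callback)
      // The returned function removes this listener again
      return () => listeners.delete(callback)
    },
    set(key, value) {
      data.set(key, value)
      // Only updates made through the cache's own methods notify listeners
      listeners.forEach(listener => listener(cache))
    },
    get: key => data.get(key),
  }
  return cache
}

const demoCache = createObservableCache()
let notifications = 0
const unsubscribe = demoCache.subscribe(() => {
  notifications += 1
})

demoCache.set('todos', []) // notifies the subscriber
unsubscribe()
demoCache.set('todos', [{ id: 1 }]) // no longer notifies

console.log(notifications)
```

Writing to the underlying storage directly would bypass the notification step, which is why out-of-scope mutations never reach subscribers.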
queryCache.clear
The clear
method can be used to clear the queryCache entirely and start fresh.
import { queryCache } from 'react-query'
queryCache.clear()
queries: Array<Query>
useIsFetching
useIsFetching
is an optional hook that returns true
if any query in your application is loading or fetching in the background (useful for app-wide loading indicators).
import { useIsFetching } from 'react-query'
const isFetching = useIsFetching()
isFetching: Boolean
Will be true if any query in your application is loading or fetching in the background.
ReactQueryConfigProvider
ReactQueryConfigProvider is an optional provider component and can be used to define defaults for all instances of useQuery within its sub-tree:
import { ReactQueryConfigProvider } from 'react-query'
const queryConfig = {
// Global
suspense: false,
useErrorBoundary: undefined, // Defaults to the value of `suspense` if not defined otherwise
throwOnError: false,
refetchAllOnWindowFocus: true,
queryKeySerializerFn: queryKey => [queryHash, queryFnArgs],
onSuccess: () => {},
onError: () => {},
onSettled: () => {},
// useQuery
retry: 3,
retryDelay: attemptIndex => Math.min(1000 * 2 ** attemptIndex, 30000),
staleTime: 0,
cacheTime: 5 * 60 * 1000,
refetchInterval: false,
queryFnParamsFilter: args => filteredArgs,
refetchOnMount: true,
}
function App() {
return (
<ReactQueryConfigProvider config={queryConfig}>
...
</ReactQueryConfigProvider>
)
}
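To see what the retryDelay default in the config above works out to, the sketch below evaluates the same formula for the first few attempts. It assumes attemptIndex starts at 0, which the config does not state explicitly.

```javascript
// Same formula as the default retryDelay shown in the config above
const retryDelay = attemptIndex => Math.min(1000 * 2 ** attemptIndex, 30000)

// Delay in milliseconds for attempt indexes 0 through 5 (assumed 0-based)
const delays = [0, 1, 2, 3, 4, 5].map(retryDelay)
console.log(delays) // [ 1000, 2000, 4000, 8000, 16000, 30000 ]
```

The delay doubles on each attempt and is capped at 30 seconds, so a query that keeps failing never waits longer than that between retries.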
config: Object
Default options for the useQuery hook and the useMutation hook.
setConsole
setConsole is an optional utility function that allows you to replace the console interface used to log errors. By default, the window.console object is used. If no global console object is found in the environment, nothing will be logged.
import { setConsole } from 'react-query'
import { printLog, printWarn, printError } from 'custom-logger'
setConsole({
log: printLog,
warn: printWarn,
error: printError,
})
console: Object
Must implement the log, warn, and error methods.
1667488080
A full Python implementation of the ROUGE metric, producing the same results as the official Perl implementation.
Important remarks
Differences from the official script are <3e-5 for ROUGE-L as well as ROUGE-W and <4e-5 for ROUGE-N.
-b 665.
In case of doubts, please see all the implemented tests to compare outputs between the official ROUGE-1.5.5 and this script.
The package is available on PyPI: https://pypi.org/project/py-rouge
You can install it with pip:
pip install py-rouge
or do it manually:
git clone https://github.com/Diego999/py-rouge
cd py-rouge
python setup.py install
Issues/Pull Requests/Feedbacks
Don't hesitate to get in touch with any feedback or to create issues/pull requests (especially if you want to rewrite the stemmer implemented in ROUGE-1.5.5 in Python ;)).
Example
import rouge
def prepare_results(m, p, r, f):
return '\t{}:\t{}: {:5.2f}\t{}: {:5.2f}\t{}: {:5.2f}'.format(m, 'P', 100.0 * p, 'R', 100.0 * r, 'F1', 100.0 * f)
for aggregator in ['Avg', 'Best', 'Individual']:
print('Evaluation with {}'.format(aggregator))
apply_avg = aggregator == 'Avg'
apply_best = aggregator == 'Best'
evaluator = rouge.Rouge(metrics=['rouge-n', 'rouge-l', 'rouge-w'],
max_n=4,
limit_length=True,
length_limit=100,
length_limit_type='words',
apply_avg=apply_avg,
apply_best=apply_best,
alpha=0.5, # Default F1_score
weight_factor=1.2,
stemming=True)
hypothesis_1 = "King Norodom Sihanouk has declined requests to chair a summit of Cambodia 's top political leaders , saying the meeting would not bring any progress in deadlocked negotiations to form a government .\nGovernment and opposition parties have asked King Norodom Sihanouk to host a summit meeting after a series of post-election negotiations between the two opposition groups and Hun Sen 's party to form a new government failed .\nHun Sen 's ruling party narrowly won a majority in elections in July , but the opposition _ claiming widespread intimidation and fraud _ has denied Hun Sen the two-thirds vote in parliament required to approve the next government .\n"
references_1 = ["Prospects were dim for resolution of the political crisis in Cambodia in October 1998.\nPrime Minister Hun Sen insisted that talks take place in Cambodia while opposition leaders Ranariddh and Sam Rainsy, fearing arrest at home, wanted them abroad.\nKing Sihanouk declined to chair talks in either place.\nA U.S. House resolution criticized Hun Sen's regime while the opposition tried to cut off his access to loans.\nBut in November the King announced a coalition government with Hun Sen heading the executive and Ranariddh leading the parliament.\nLeft out, Sam Rainsy sought the King's assurance of Hun Sen's promise of safety and freedom for all politicians.",
"Cambodian prime minister Hun Sen rejects demands of 2 opposition parties for talks in Beijing after failing to win a 2/3 majority in recent elections.\nSihanouk refuses to host talks in Beijing.\nOpposition parties ask the Asian Development Bank to stop loans to Hun Sen's government.\nCCP defends Hun Sen to the US Senate.\nFUNCINPEC refuses to share the presidency.\nHun Sen and Ranariddh eventually form a coalition at summit convened by Sihanouk.\nHun Sen remains prime minister, Ranariddh is president of the national assembly, and a new senate will be formed.\nOpposition leader Rainsy left out.\nHe seeks strong assurance of safety should he return to Cambodia.\n",
]
hypothesis_2 = "China 's government said Thursday that two prominent dissidents arrested this week are suspected of endangering national security _ the clearest sign yet Chinese leaders plan to quash a would-be opposition party .\nOne leader of a suppressed new political party will be tried on Dec. 17 on a charge of colluding with foreign enemies of China '' to incite the subversion of state power , '' according to court documents given to his wife on Monday .\nWith attorneys locked up , harassed or plain scared , two prominent dissidents will defend themselves against charges of subversion Thursday in China 's highest-profile dissident trials in two years .\n"
references_2 = "Hurricane Mitch, category 5 hurricane, brought widespread death and destruction to Central American.\nEspecially hard hit was Honduras where an estimated 6,076 people lost their lives.\nThe hurricane, which lingered off the coast of Honduras for 3 days before moving off, flooded large areas, destroying crops and property.\nThe U.S. and European Union were joined by Pope John Paul II in a call for money and workers to help the stricken area.\nPresident Clinton sent Tipper Gore, wife of Vice President Gore to the area to deliver much needed supplies to the area, demonstrating U.S. commitment to the recovery of the region.\n"
all_hypothesis = [hypothesis_1, hypothesis_2]
all_references = [references_1, references_2]
scores = evaluator.get_scores(all_hypothesis, all_references)
for metric, results in sorted(scores.items(), key=lambda x: x[0]):
    if not apply_avg and not apply_best:  # value is a list, as we evaluate each summary vs each reference
        for hypothesis_id, results_per_ref in enumerate(results):
            nb_references = len(results_per_ref['p'])
            for reference_id in range(nb_references):
                print('\tHypothesis #{} & Reference #{}: '.format(hypothesis_id, reference_id))
                print('\t' + prepare_results(metric, results_per_ref['p'][reference_id], results_per_ref['r'][reference_id], results_per_ref['f'][reference_id]))
        print()
    else:
        print(prepare_results(metric, results['p'], results['r'], results['f']))
print()
It produces the following output:
Evaluation with Avg
rouge-1: P: 28.62 R: 26.46 F1: 27.49
rouge-2: P: 4.21 R: 3.92 F1: 4.06
rouge-3: P: 0.80 R: 0.74 F1: 0.77
rouge-4: P: 0.00 R: 0.00 F1: 0.00
rouge-l: P: 30.52 R: 28.57 F1: 29.51
rouge-w: P: 15.85 R: 8.28 F1: 10.87
Evaluation with Best
rouge-1: P: 30.44 R: 28.36 F1: 29.37
rouge-2: P: 4.74 R: 4.46 F1: 4.59
rouge-3: P: 1.06 R: 0.98 F1: 1.02
rouge-4: P: 0.00 R: 0.00 F1: 0.00
rouge-l: P: 31.54 R: 29.71 F1: 30.60
rouge-w: P: 16.42 R: 8.82 F1: 11.47
Evaluation with Individual
Hypothesis #0 & Reference #0:
rouge-1: P: 38.54 R: 35.58 F1: 37.00
Hypothesis #0 & Reference #1:
rouge-1: P: 45.83 R: 43.14 F1: 44.44
Hypothesis #1 & Reference #0:
rouge-1: P: 15.05 R: 13.59 F1: 14.29
Hypothesis #0 & Reference #0:
rouge-2: P: 7.37 R: 6.80 F1: 7.07
Hypothesis #0 & Reference #1:
rouge-2: P: 9.47 R: 8.91 F1: 9.18
Hypothesis #1 & Reference #0:
rouge-2: P: 0.00 R: 0.00 F1: 0.00
Hypothesis #0 & Reference #0:
rouge-3: P: 2.13 R: 1.96 F1: 2.04
Hypothesis #0 & Reference #1:
rouge-3: P: 1.06 R: 1.00 F1: 1.03
Hypothesis #1 & Reference #0:
rouge-3: P: 0.00 R: 0.00 F1: 0.00
Hypothesis #0 & Reference #0:
rouge-4: P: 0.00 R: 0.00 F1: 0.00
Hypothesis #0 & Reference #1:
rouge-4: P: 0.00 R: 0.00 F1: 0.00
Hypothesis #1 & Reference #0:
rouge-4: P: 0.00 R: 0.00 F1: 0.00
Hypothesis #0 & Reference #0:
rouge-l: P: 42.11 R: 39.39 F1: 40.70
Hypothesis #0 & Reference #1:
rouge-l: P: 46.19 R: 43.92 F1: 45.03
Hypothesis #1 & Reference #0:
rouge-l: P: 16.88 R: 15.50 F1: 16.16
Hypothesis #0 & Reference #0:
rouge-w: P: 22.27 R: 11.49 F1: 15.16
Hypothesis #0 & Reference #1:
rouge-w: P: 24.56 R: 13.60 F1: 17.51
Hypothesis #1 & Reference #0:
rouge-w: P: 8.29 R: 4.04 F1: 5.43
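The Avg and Best figures above appear to follow directly from the Individual ones. As a rough check (an inference from the printed numbers, not a description of py-rouge internals), the Python sketch below takes the rouge-1 precision values from the Individual run, combines the per-reference scores of each hypothesis by mean or by max, and then averages over hypotheses. The `aggregate` helper is hypothetical.

```python
from statistics import mean

# rouge-1 precision per hypothesis, one entry per reference,
# copied from the "Evaluation with Individual" output above.
individual_p = [
    [38.54, 45.83],  # hypothesis #0 vs reference #0 and reference #1
    [15.05],         # hypothesis #1 vs reference #0
]


def aggregate(per_hypothesis, combine):
    """Combine the per-reference scores of each hypothesis, then average over hypotheses."""
    return mean(combine(scores) for scores in per_hypothesis)


avg_p = aggregate(individual_p, mean)  # mean over references (assumed "Avg" behaviour)
best_p = aggregate(individual_p, max)  # best-matching reference (assumed "Best" behaviour)

print("Avg  P: {:.2f}".format(avg_p))
print("Best P: {:.2f}".format(best_p))
```

Both values agree with the rouge-1 precision printed under "Evaluation with Avg" (28.62) and "Evaluation with Best" (30.44).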
Author: Diego999
Source Code: https://github.com/Diego999/py-rouge
License: Apache-2.0 license
A new Cumulus-based Substrate node, ready for hacking :cloud:
This project is a fork of the Substrate Node Template modified to include dependencies required for registering this node as a parathread or parachain to an established relay chain.
👉 Learn more about parachains here, and parathreads here.
Follow these steps to prepare a local Substrate development environment :hammer_and_wrench:
If necessary, refer to the setup instructions at the Substrate Developer Hub.
Once the development environment is set up, build the Cumulus Parachain Template. This command will build the Wasm Runtime and native code:
cargo build --release
NOTE: In the following two sections, we document how to manually start a few relay chain nodes, start a parachain node (collator), and register the parachain with the relay chain.
We also have the polkadot-launch CLI tool, which automates the following steps and helps you easily launch relay chains and parachains. However, it is still good to go through the following procedures once to understand the mechanism for running and registering a parachain.
To operate a parathread or parachain, you must connect to a relay chain. Typically you would test on a local Rococo development network, then move to the testnet, and finally launch on the mainnet. Keep in mind that you need to configure the specific relay chain you will connect to in your collator chain_spec.rs. In the following examples, we will use rococo-local as the relay network.
Clone and build Polkadot (beware of the version tag we used):
# Get a fresh clone, or `cd` to where you have polkadot already:
git clone -b v0.9.7 --depth 1 https://github.com/paritytech/polkadot.git
cd polkadot
cargo build --release
First, we create the chain specification file (chainspec). Note the chainspec file must be generated on a single node and then shared among all nodes!
👉 Learn more about chain specification here.
./target/release/polkadot build-spec \
--chain rococo-local \
--raw \
--disable-default-bootnode \
> rococo_local.json
We need n + 1 full validator nodes running on a relay chain to accept n parachain / parathread connections. Here we will start two relay chain nodes so we can have one parachain node connecting in later.
From the Polkadot working directory:
# Start Relay `Alice` node
./target/release/polkadot \
--chain ./rococo_local.json \
-d /tmp/relay/alice \
--validator \
--alice \
--port 50555
Open a new terminal, same directory:
# Start Relay `Bob` node
./target/release/polkadot \
--chain ./rococo_local.json \
-d /tmp/relay/bob \
--validator \
--bob \
--port 50556
Add more nodes as needed, with non-conflicting ports, DB directories, and validator keys (--charlie, --dave, etc.).
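As a sketch of what "non-conflicting" means here, the Python helper below builds one command per validator, giving each its own database directory, key flag, and port. It reuses only the flags shown above; the `relay_node_command` name and the choice of counting ports up from 50555 are assumptions made for illustration.

```python
# Development key names, used as --alice, --bob, ... flags above.
VALIDATORS = ["alice", "bob", "charlie", "dave"]
BASE_PORT = 50555  # port used for Alice above; later nodes count up from it (assumption)


def relay_node_command(name, index, chain_spec="./rococo_local.json"):
    """Return the argument list for one relay chain validator node."""
    return [
        "./target/release/polkadot",
        "--chain", chain_spec,
        "-d", "/tmp/relay/{}".format(name),  # separate DB directory per node
        "--validator",
        "--{}".format(name),                 # separate validator key per node
        "--port", str(BASE_PORT + index),    # separate port per node
    ]


commands = [relay_node_command(name, i) for i, name in enumerate(VALIDATORS)]
for argv in commands:
    print(" ".join(argv))
```

The first two printed commands reproduce the Alice and Bob invocations above; each one still needs its own terminal.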
To connect to a relay chain, you must first reserve a ParaId for your parathread that will become a parachain. To do this, you will need a sufficient amount of currency in the network account to reserve the ID.
In this example, we will use the Charlie development account, where we have funds available. Once you submit this extrinsic successfully, you can start your collators.
The easiest way to reserve your ParaId is via the Polkadot Apps UI: open the Parachains -> Parathreads tab and use the + ParaID button.
To operate your parachain, you need to specify the correct relay chain you will connect to in your collator chain_spec.rs. Specifically, you pass the command for the network you need in the Extensions of your ChainSpec::from_genesis() in the code.
Extensions {
relay_chain: "rococo-local".into(), // You MUST set this to the correct network!
para_id: id.into(),
},
You can choose from any pre-set runtime chainspec in the Polkadot repo by referring to the cli/src/command.rs and node/service/src/chain_spec.rs files, or generate your own and use that. See the Cumulus Workshop for how.
In the following examples, we will use the rococo-local relay network we set up in the last section.
We first generate the genesis state and genesis wasm needed for the parachain registration.
# Build the parachain node (from its top-level dir)
cd substrate-parachain-template
cargo build --release
# Folder to store resource files needed for parachain registration
mkdir -p resources
# Build the chainspec
./target/release/parachain-collator build-spec \
--disable-default-bootnode > ./resources/template-local-plain.json
# Build the raw chainspec file
./target/release/parachain-collator build-spec \
--chain=./resources/template-local-plain.json \
--raw --disable-default-bootnode > ./resources/template-local-raw.json
# Export genesis state to `./resources`, using 2000 as the ParaId
./target/release/parachain-collator export-genesis-state --parachain-id 2000 > ./resources/para-2000-genesis
# Export the genesis wasm
./target/release/parachain-collator export-genesis-wasm > ./resources/para-2000-wasm
NOTE: we have set the para_ID to be 2000 here. This must be unique for all parathreads/chains on the relay chain you register with. You must reserve this first on the relay chain for the testnet or mainnet.
From the parachain template working directory:
# NOTE: this command assumes the chain spec is in a directory named `polkadot`
# that is at the same level of the template working directory. Change as needed.
#
# It also assumes a ParaId of 2000. Change as needed.
./target/release/parachain-collator \
-d /tmp/parachain/alice \
--collator \
--alice \
--force-authoring \
--ws-port 9945 \
--parachain-id 2000 \
-- \
--execution wasm \
--chain ../polkadot/rococo_local.json
Output:
2021-05-30 16:57:39 Parachain Collator Template
2021-05-30 16:57:39 ✌️ version 3.0.0-acce183-x86_64-linux-gnu
2021-05-30 16:57:39 ❤️ by Anonymous, 2017-2021
2021-05-30 16:57:39 📋 Chain specification: Local Testnet
2021-05-30 16:57:39 🏷 Node name: Alice
2021-05-30 16:57:39 👤 Role: AUTHORITY
2021-05-30 16:57:39 💾 Database: RocksDb at /tmp/parachain/alice/chains/local_testnet/db
2021-05-30 16:57:39 ⛓ Native runtime: template-parachain-1 (template-parachain-0.tx1.au1)
2021-05-30 16:57:41 Parachain id: Id(2000)
2021-05-30 16:57:41 Parachain Account: 5Ec4AhPUwPeyTFyuhGuBbD224mY85LKLMSqSSo33JYWCazU4
2021-05-30 16:57:41 Parachain genesis state: 0x0000000000000000000000000000000000000000000000000000000000000000000a96f42b5cb798190e5f679bb16970905087a9a9fc612fb5ca6b982b85783c0d03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c11131400
2021-05-30 16:57:41 Is collating: yes
2021-05-30 16:57:41 [Parachain] 🔨 Initializing Genesis block/state (state: 0x0a96…3c0d, header-hash: 0xd42b…f271)
2021-05-30 16:57:41 [Parachain] ⏱ Loaded block-time = 12s from block 0xd42bb78354bc21770e3f0930ed45c7377558d2d8e81ca4d457e573128aabf271
2021-05-30 16:57:43 [Relaychain] 🔨 Initializing Genesis block/state (state: 0xace1…1b62, header-hash: 0xfa68…cf58)
2021-05-30 16:57:43 [Relaychain] 👴 Loading GRANDPA authority set from genesis on what appears to be first startup.
2021-05-30 16:57:44 [Relaychain] ⏱ Loaded block-time = 6s from block 0xfa68f5abd2a80394b87c9bd07e0f4eee781b8c696d0a22c8e5ba38ae10e1cf58
2021-05-30 16:57:44 [Relaychain] 👶 Creating empty BABE epoch changes on what appears to be first startup.
2021-05-30 16:57:44 [Relaychain] 🏷 Local node identity is: 12D3KooWBjYK2W4dsBfsrFA9tZCStb5ogPb6STQqi2AK9awXfXyG
2021-05-30 16:57:44 [Relaychain] 📦 Highest known block at #0
2021-05-30 16:57:44 [Relaychain] 〽️ Prometheus server started at 127.0.0.1:9616
2021-05-30 16:57:44 [Relaychain] Listening for new connections on 127.0.0.1:9945.
2021-05-30 16:57:44 [Parachain] Using default protocol ID "sup" because none is configured in the chain specs
2021-05-30 16:57:44 [Parachain] 🏷 Local node identity is: 12D3KooWADBSC58of6ng2M29YTDkmWCGehHoUZhsy9LGkHgYscBw
2021-05-30 16:57:44 [Parachain] 📦 Highest known block at #0
2021-05-30 16:57:44 [Parachain] Unable to listen on 127.0.0.1:9945
2021-05-30 16:57:44 [Parachain] Unable to bind RPC server to 127.0.0.1:9945. Trying random port.
2021-05-30 16:57:44 [Parachain] Listening for new connections on 127.0.0.1:45141.
2021-05-30 16:57:45 [Relaychain] 🔍 Discovered new external address for our node: /ip4/192.168.42.204/tcp/30334/ws/p2p/12D3KooWBjYK2W4dsBfsrFA9tZCStb5ogPb6STQqi2AK9awXfXyG
2021-05-30 16:57:45 [Parachain] 🔍 Discovered new external address for our node: /ip4/192.168.42.204/tcp/30333/p2p/12D3KooWADBSC58of6ng2M29YTDkmWCGehHoUZhsy9LGkHgYscBw
2021-05-30 16:57:48 [Relaychain] ✨ Imported #8 (0xe60b…9b0a)
2021-05-30 16:57:49 [Relaychain] 💤 Idle (2 peers), best: #8 (0xe60b…9b0a), finalized #5 (0x1e6f…567c), ⬇ 4.5kiB/s ⬆ 2.2kiB/s
2021-05-30 16:57:49 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 2.0kiB/s ⬆ 1.7kiB/s
2021-05-30 16:57:54 [Relaychain] ✨ Imported #9 (0x1af9…c9be)
2021-05-30 16:57:54 [Relaychain] ✨ Imported #9 (0x6ed8…fdf6)
2021-05-30 16:57:54 [Relaychain] 💤 Idle (2 peers), best: #9 (0x1af9…c9be), finalized #6 (0x3319…69a2), ⬇ 1.8kiB/s ⬆ 0.5kiB/s
2021-05-30 16:57:54 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0.2kiB/s ⬆ 0.2kiB/s
2021-05-30 16:57:59 [Relaychain] 💤 Idle (2 peers), best: #9 (0x1af9…c9be), finalized #7 (0x5b50…1e5b), ⬇ 0.6kiB/s ⬆ 0.4kiB/s
2021-05-30 16:57:59 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 16:58:00 [Relaychain] ✨ Imported #10 (0xc9c9…1ca3)
You will see messages from both a relay chain node and a parachain node. This is because a relay chain light client also runs next to the parachain collator.
Now that you have two relay chain nodes and a parachain node (accompanied by a relay chain light client) running, the next step is to register the parachain in the relay chain with the following steps (for details, refer to the Substrate Cumulus Workshop):
1. Open the Developer -> sudo page.
2. Pick paraSudoWrapper -> sudoScheduleParaInitialize(id, genesis) as the extrinsic type.
3. Set id: ParaId to 2000 (or whatever ParaId you used above), and set the parachain: Bool option to Yes.
4. For genesisHead, drag in the genesis state file exported above, para-2000-genesis.
5. For validationCode, drag in the genesis wasm file exported above, para-2000-wasm.
Note: When registering to the public Rococo testnet, ensure you set a unique paraId larger than 1,000. Values below 1,000 are reserved exclusively for system parachains.
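The two ParaId constraints mentioned so far (unique on the relay chain, and larger than 1,000 on the public Rococo testnet) are easy to check before submitting the registration. A minimal Python sketch follows; `check_para_id` and the set of already-registered IDs are hypothetical, since in practice that set would have to come from the relay chain itself.

```python
SYSTEM_PARA_ID_LIMIT = 1000  # the note above asks for a ParaId larger than this


def check_para_id(para_id, registered_ids, public_testnet=True):
    """Return a list of problems with a candidate ParaId (empty if none were found)."""
    problems = []
    if para_id in registered_ids:
        problems.append("ParaId {} is already registered".format(para_id))
    if public_testnet and para_id <= SYSTEM_PARA_ID_LIMIT:
        problems.append("ParaId {} is not larger than {}".format(para_id, SYSTEM_PARA_ID_LIMIT))
    return problems


already_registered = {2001, 2002}  # made-up example data
for candidate in (2000, 999, 2001):
    print(candidate, check_para_id(candidate, already_registered))
```

On a local rococo-local network the testnet rule does not apply, so `public_testnet=False` skips that check.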
The collator node may need to be restarted to get it functioning as expected. After a new epoch starts on the relay chain, your parachain will come online. Once this happens, you should see the collator start reporting parachain blocks:
# Notice the relay epoch change! Only then do we start parachain collating!
#
2021-05-30 17:00:04 [Relaychain] 💤 Idle (2 peers), best: #30 (0xfc02…2a2a), finalized #28 (0x10ff…6539), ⬇ 1.0kiB/s ⬆ 0.3kiB/s
2021-05-30 17:00:04 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:06 [Relaychain] 👶 New epoch 3 launching at block 0x68bc…0605 (block slot 270402601 >= start slot 270402601).
2021-05-30 17:00:06 [Relaychain] 👶 Next epoch starts at slot 270402611
2021-05-30 17:00:06 [Relaychain] ✨ Imported #31 (0x68bc…0605)
2021-05-30 17:00:06 [Parachain] Starting collation. relay_parent=0x68bcc93d24a31a2c89800a56c7a2b275fe9ca7bd63f829b64588ae0d99280605 at=0xd42bb78354bc21770e3f0930ed45c7377558d2d8e81ca4d457e573128aabf271
2021-05-30 17:00:06 [Parachain] 🙌 Starting consensus session on top of parent 0xd42bb78354bc21770e3f0930ed45c7377558d2d8e81ca4d457e573128aabf271
2021-05-30 17:00:06 [Parachain] 🎁 Prepared block for proposing at 1 [hash: 0xf6507812bf60bf53af1311f775aac03869be870df6b0406b2969784d0935cb92; parent_hash: 0xd42b…f271; extrinsics (2): [0x1bf5…1d76, 0x7c9b…4e23]]
2021-05-30 17:00:06 [Parachain] 🔖 Pre-sealed block for proposal at 1. Hash now 0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae, previously 0xf6507812bf60bf53af1311f775aac03869be870df6b0406b2969784d0935cb92.
2021-05-30 17:00:06 [Parachain] ✨ Imported #1 (0x80fc…ccae)
2021-05-30 17:00:06 [Parachain] Produced proof-of-validity candidate. block_hash=0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae
2021-05-30 17:00:09 [Relaychain] 💤 Idle (2 peers), best: #31 (0x68bc…0605), finalized #29 (0xa6fa…9e16), ⬇ 1.2kiB/s ⬆ 129.9kiB/s
2021-05-30 17:00:09 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:12 [Relaychain] ✨ Imported #32 (0x5e92…ba30)
2021-05-30 17:00:12 [Relaychain] Moving approval window from session 0..=2 to 0..=3
2021-05-30 17:00:12 [Relaychain] ✨ Imported #32 (0x8144…74eb)
2021-05-30 17:00:14 [Relaychain] 💤 Idle (2 peers), best: #32 (0x5e92…ba30), finalized #29 (0xa6fa…9e16), ⬇ 1.4kiB/s ⬆ 0.2kiB/s
2021-05-30 17:00:14 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:18 [Relaychain] ✨ Imported #33 (0x8c30…9ccd)
2021-05-30 17:00:18 [Parachain] Starting collation. relay_parent=0x8c30ce9e6e9867824eb2aff40148ac1ed64cf464f51c5f2574013b44b20f9ccd at=0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae
2021-05-30 17:00:19 [Relaychain] 💤 Idle (2 peers), best: #33 (0x8c30…9ccd), finalized #30 (0xfc02…2a2a), ⬇ 0.7kiB/s ⬆ 0.4kiB/s
2021-05-30 17:00:19 [Parachain] 💤 Idle (0 peers), best: #1 (0x80fc…ccae), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:22 [Relaychain] 👴 Applying authority set change scheduled at block #31
2021-05-30 17:00:22 [Relaychain] 👴 Applying GRANDPA set change to new set [(Public(88dc3417d5058ec4b4503e0c12ea1a0a89be200fe98922423d4334014fa6b0ee (5FA9nQDV...)), 1), (Public(d17c2d7823ebf260fd138f2d7e27d114c0145d968b5ff5006125f2414fadae69 (5GoNkf6W...)), 1)]
2021-05-30 17:00:22 [Relaychain] 👴 Imported justification for block #31 that triggers command Changing authorities, signaling voter.
2021-05-30 17:00:24 [Relaychain] ✨ Imported #34 (0x211b…febf)
2021-05-30 17:00:24 [Parachain] Starting collation. relay_parent=0x211b3c53bebeff8af05e8f283d59fe171b7f91a5bf9c4669d88943f5a42bfebf at=0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae
2021-05-30 17:00:24 [Parachain] 🙌 Starting consensus session on top of parent 0x80fc151d7ccf228b802525022b6de257e42388ec7dc3c1dd7de491313650ccae
2021-05-30 17:00:24 [Parachain] 🎁 Prepared block for proposing at 2 [hash: 0x10fcb3180e966729c842d1b0c4d8d2c4028cfa8bef02b909af5ef787e6a6a694; parent_hash: 0x80fc…ccae; extrinsics (2): [0x4a6c…1fc6, 0x6b84…7cea]]
2021-05-30 17:00:24 [Parachain] 🔖 Pre-sealed block for proposal at 2. Hash now 0x5087fd06b1b73d90cfc3ad175df8495b378fffbb02fea212cc9e49a00fd8b5a0, previously 0x10fcb3180e966729c842d1b0c4d8d2c4028cfa8bef02b909af5ef787e6a6a694.
2021-05-30 17:00:24 [Parachain] ✨ Imported #2 (0x5087…b5a0)
2021-05-30 17:00:24 [Parachain] Produced proof-of-validity candidate. block_hash=0x5087fd06b1b73d90cfc3ad175df8495b378fffbb02fea212cc9e49a00fd8b5a0
2021-05-30 17:00:24 [Relaychain] 💤 Idle (2 peers), best: #34 (0x211b…febf), finalized #31 (0x68bc…0605), ⬇ 1.0kiB/s ⬆ 130.1kiB/s
2021-05-30 17:00:24 [Parachain] 💤 Idle (0 peers), best: #1 (0x80fc…ccae), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:29 [Relaychain] 💤 Idle (2 peers), best: #34 (0x211b…febf), finalized #32 (0x5e92…ba30), ⬇ 0.2kiB/s ⬆ 0.1kiB/s
2021-05-30 17:00:29 [Parachain] 💤 Idle (0 peers), best: #1 (0x80fc…ccae), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0
2021-05-30 17:00:30 [Relaychain] ✨ Imported #35 (0xee07…38a0)
2021-05-30 17:00:34 [Relaychain] 💤 Idle (2 peers), best: #35 (0xee07…38a0), finalized #33 (0x8c30…9ccd), ⬇ 0.9kiB/s ⬆ 0.3kiB/s
2021-05-30 17:00:34 [Parachain] 💤 Idle (0 peers), best: #1 (0x80fc…ccae), finalized #1 (0x80fc…ccae), ⬇ 0 ⬆ 0
2021-05-30 17:00:36 [Relaychain] ✨ Imported #36 (0xe8ce…4af6)
2021-05-30 17:00:36 [Parachain] Starting collation. relay_parent=0xe8cec8015c0c7bf508bf3f2f82b1696e9cca078e814b0f6671f0b0d5dfe84af6 at=0x5087fd06b1b73d90cfc3ad175df8495b378fffbb02fea212cc9e49a00fd8b5a0
2021-05-30 17:00:39 [Relaychain] 💤 Idle (2 peers), best: #36 (0xe8ce…4af6), finalized #33 (0x8c30…9ccd), ⬇ 0.6kiB/s ⬆ 0.1kiB/s
2021-05-30 17:00:39 [Parachain] 💤 Idle (0 peers), best: #2 (0x5087…b5a0), finalized #1 (0x80fc…ccae), ⬇ 0 ⬆ 0
Note the delay here! It may take some time for your relay chain to enter a new epoch.
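If you are watching for this moment in a long log, a short script can help. The Python sketch below scans collator output for the first relay chain "New epoch" line and the first parachain block imported after it. It relies only on the line shapes visible in the log above; the `first_events` helper is hypothetical, and real log formats may differ between versions.

```python
import re

# Line shapes taken from the sample log above.
EPOCH_RE = re.compile(r"\[Relaychain\].*New epoch (\d+) launching")
PARA_IMPORT_RE = re.compile(r"\[Parachain\].*Imported #(\d+)")


def first_events(lines):
    """Return (first epoch number seen, first parachain block imported after it)."""
    epoch = None
    block = None
    for line in lines:
        if epoch is None:
            match = EPOCH_RE.search(line)
            if match:
                epoch = int(match.group(1))
        elif block is None:
            match = PARA_IMPORT_RE.search(line)
            if match:
                block = int(match.group(1))
    return epoch, block


sample = [
    "2021-05-30 17:00:04 [Parachain] 💤 Idle (0 peers), best: #0 (0xd42b…f271), finalized #0 (0xd42b…f271), ⬇ 0 ⬆ 0",
    "2021-05-30 17:00:06 [Relaychain] 👶 New epoch 3 launching at block 0x68bc…0605 (block slot 270402601 >= start slot 270402601).",
    "2021-05-30 17:00:06 [Relaychain] ✨ Imported #31 (0x68bc…0605)",
    "2021-05-30 17:00:06 [Parachain] ✨ Imported #1 (0x80fc…ccae)",
]
print(first_events(sample))
```

For the four sample lines, which are copied from the log above, the result is epoch 3 followed by parachain block 1.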
Is this Cumulus Parachain Template compatible with the Rococo and Westend testnets? Yes!
See the Cumulus Workshop for the latest instructions to register a parathread/parachain on a relay chain.
NOTE: When running the relay chain and parachain, you must use the same tagged version of Polkadot and Cumulus so that the collator can register successfully with the relay chain. You should test registering your parachain locally before attempting to connect to any running relay chain network!
Find chainspec files to connect to live networks here. Be sure to use the correct git release tag for these files, as they change from time to time and must match the live network!
These networks are under constant development, so please follow their progress and update your parachains in lockstep with the testnet changes if you wish to connect to the network. Do join the Parachain Technical matrix chat room to ask questions and connect with the parachain building teams.
Download Details:
Author: aresprotocols
Source Code: https://github.com/aresprotocols/substrate-parachain-template
License: Unlicense License
A symbolic natural language parsing library for Rust, inspired by HPSG.
This is a library for parsing natural or constructed languages into syntax trees and feature structures. There's no machine learning and there are no probabilistic models; everything is hand-crafted and deterministic.
You can find out more about the motivations of this project in this blog post.
I'm using this to parse a constructed language for my upcoming xenolinguistics game, Themengi.
Using a simple 80-line grammar, introduced in the tutorial below, we can parse a simple subset of English, checking reflexive pronoun binding, case, and number agreement.
$ cargo run --bin cli examples/reflexives.fgr
> she likes himself
Parsed 0 trees
> her likes herself
Parsed 0 trees
> she like herself
Parsed 0 trees
> she likes herself
Parsed 1 tree
(0..3: S
(0..1: N (0..1: she))
(1..2: TV (1..2: likes))
(2..3: N (2..3: herself)))
[
child-2: [
case: acc
pron: ref
needs_pron: #0 she
num: sg
child-0: [ word: herself ]
]
child-1: [
tense: nonpast
child-0: [ word: likes ]
num: #1 sg
]
child-0: [
child-0: [ word: she ]
case: nom
pron: #0
num: #1
]
]
Low resource language? Low problem! No need to train on gigabytes of text, just write a grammar using your brain. Let's hypothesize that in American Sign Language, topicalized nouns (expressed with raised eyebrows) must appear first in the sentence. We can write a small grammar (18 lines), and plug in some sentences:
$ cargo run --bin cli examples/asl-wordorder.fgr -n
> boy sit
Parsed 1 tree
(0..2: S
(0..1: NP ((0..1: N (0..1: boy))))
(1..2: IV (1..2: sit)))
> boy throw ball
Parsed 1 tree
(0..3: S
(0..1: NP ((0..1: N (0..1: boy))))
(1..2: TV (1..2: throw))
(2..3: NP ((2..3: N (2..3: ball)))))
> ball nm-raised-eyebrows boy throw
Parsed 1 tree
(0..4: S
(0..2: NP
(0..1: N (0..1: ball))
(1..2: Topic (1..2: nm-raised-eyebrows)))
(2..3: NP ((2..3: N (2..3: boy))))
(3..4: TV (3..4: throw)))
> boy throw ball nm-raised-eyebrows
Parsed 0 trees
As an example, let's say we want to build a parser for English reflexive pronouns (himself, herself, themselves, themself, itself). We'll also support number ("He likes X" vs. "They like X") and simple embedded clauses ("He said that they like X").
Grammar files are written in a custom language, similar to BNF, called Feature GRammar (.fgr). There's a VSCode syntax highlighting extension for these files available as fgr-syntax.
We'll start by defining our lexicon. The lexicon is the set of terminal symbols (symbols in the actual input) that the grammar will match. Terminal symbols must start with a lowercase letter, and non-terminal symbols must start with an uppercase letter.
// pronouns
N -> he
N -> him
N -> himself
N -> she
N -> her
N -> herself
N -> they
N -> them
N -> themselves
N -> themself
// names, lowercase as they are terminals
N -> mary
N -> sue
N -> takeshi
N -> robert
// complementizer
Comp -> that
// verbs -- intransitive, transitive, and clausal
IV -> falls
IV -> fall
IV -> fell
TV -> likes
TV -> like
TV -> liked
CV -> says
CV -> say
CV -> said
Next, we can add our sentence rules (they must be added at the top, as the first rule in the file is assumed to be the top-level rule):
// sentence rules
S -> N IV
S -> N TV N
S -> N CV Comp S
// ... previous lexicon ...
Assuming this file is saved as examples/no-features.fgr (which it is :wink:), we can test this file with the built-in CLI:
$ cargo run --bin cli examples/no-features.fgr
> he falls
Parsed 1 tree
(0..2: S
(0..1: N (0..1: he))
(1..2: IV (1..2: falls)))
[
child-1: [ child-0: [ word: falls ] ]
child-0: [ child-0: [ word: he ] ]
]
> he falls her
Parsed 0 trees
> he likes her
Parsed 1 tree
(0..3: S
(0..1: N (0..1: he))
(1..2: TV (1..2: likes))
(2..3: N (2..3: her)))
[
child-2: [ child-0: [ word: her ] ]
child-1: [ child-0: [ word: likes ] ]
child-0: [ child-0: [ word: he ] ]
]
> he likes
Parsed 0 trees
> he said that he likes her
Parsed 1 tree
(0..6: S
(0..1: N (0..1: he))
(1..2: CV (1..2: said))
(2..3: Comp (2..3: that))
(3..6: S
(3..4: N (3..4: he))
(4..5: TV (4..5: likes))
(5..6: N (5..6: her))))
[
child-0: [ child-0: [ word: he ] ]
child-2: [ child-0: [ word: that ] ]
child-1: [ child-0: [ word: said ] ]
child-3: [
child-2: [ child-0: [ word: her ] ]
child-1: [ child-0: [ word: likes ] ]
child-0: [ child-0: [ word: he ] ]
]
]
> he said that he
Parsed 0 trees
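To make the rule matching concrete, here is a minimal Python sketch of a recognizer for the three sentence rules and the lexicon above. It is a naive top-down matcher written for illustration, not how the library itself parses, and the `match` and `count_parses` names are hypothetical. It would loop forever on a left-recursive rule, which this small grammar does not have.

```python
# Lexicon and sentence rules from the no-features grammar above.
LEXICON = {
    "N": {"he", "him", "himself", "she", "her", "herself", "they", "them",
          "themselves", "themself", "mary", "sue", "takeshi", "robert"},
    "Comp": {"that"},
    "IV": {"falls", "fall", "fell"},
    "TV": {"likes", "like", "liked"},
    "CV": {"says", "say", "said"},
}
RULES = {"S": [["N", "IV"], ["N", "TV", "N"], ["N", "CV", "Comp", "S"]]}


def match(symbol, words, start):
    """Yield every position at which `symbol` can end when it begins at `start`."""
    if symbol in LEXICON:
        if start < len(words) and words[start] in LEXICON[symbol]:
            yield start + 1
        return
    for production in RULES[symbol]:
        ends = [start]
        for child in production:
            # Extend every partial match by one more child symbol.
            ends = [nxt for end in ends for nxt in match(child, words, end)]
        yield from ends


def count_parses(sentence):
    """Count derivations of S that cover the whole sentence."""
    words = sentence.split()
    return sum(1 for end in match("S", words, 0) if end == len(words))


for sentence in ["he falls", "he falls her", "he said that he likes her", "he said that he"]:
    print(sentence, "->", count_parses(sentence))
```

The counts agree with the "Parsed 1 tree" and "Parsed 0 trees" lines in the CLI session above.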
This grammar already parses some correct sentences, and blocks some trivially incorrect ones. However, it doesn't care about number, case, or reflexives right now:
> she likes himself // unbound reflexive pronoun
Parsed 1 tree
(0..3: S
(0..1: N (0..1: she))
(1..2: TV (1..2: likes))
(2..3: N (2..3: himself)))
[
child-0: [ child-0: [ word: she ] ]
child-2: [ child-0: [ word: himself ] ]
child-1: [ child-0: [ word: likes ] ]
]
> him like her // incorrect case on the subject pronoun, should be nominative
// (he) instead of accusative (him)
Parsed 1 tree
(0..3: S
(0..1: N (0..1: him))
(1..2: TV (1..2: like))
(2..3: N (2..3: her)))
[
child-0: [ child-0: [ word: him ] ]
child-1: [ child-0: [ word: like ] ]
child-2: [ child-0: [ word: her ] ]
]
> he like her // incorrect verb number agreement
Parsed 1 tree
(0..3: S
(0..1: N (0..1: he))
(1..2: TV (1..2: like))
(2..3: N (2..3: her)))
[
child-2: [ child-0: [ word: her ] ]
child-1: [ child-0: [ word: like ] ]
child-0: [ child-0: [ word: he ] ]
]
To fix this, we need to add features to our lexicon, and restrict the sentence rules based on features.
Features are added with square brackets, and are key: value pairs separated by commas. **top** is a special feature value, which basically means "unspecified" -- we'll come back to it later. Features that are unspecified are also assumed to have a **top** value, but sometimes explicitly stating **top** is clearer.
/// Pronouns
// The added features are:
// * num: sg or pl, whether this noun wants a singular verb (likes) or
// a plural verb (like). note this is grammatical number, so for example
// singular they takes plural agreement ("they like X", not *"they likes X")
// * case: nom or acc, whether this noun is nominative or accusative case.
// nominative case goes in the subject, and accusative in the object.
// e.g., "he fell" and "she likes him", not *"him fell" and *"her likes he"
// * pron: he, she, they, or ref -- what type of pronoun this is
// * needs_pron: whether this is a reflexive that needs to bind to another
// pronoun.
N[ num: sg, case: nom, pron: he ] -> he
N[ num: sg, case: acc, pron: he ] -> him
N[ num: sg, case: acc, pron: ref, needs_pron: he ] -> himself
N[ num: sg, case: nom, pron: she ] -> she
N[ num: sg, case: acc, pron: she ] -> her
N[ num: sg, case: acc, pron: ref, needs_pron: she] -> herself
N[ num: pl, case: nom, pron: they ] -> they
N[ num: pl, case: acc, pron: they ] -> them
N[ num: pl, case: acc, pron: ref, needs_pron: they ] -> themselves
N[ num: sg, case: acc, pron: ref, needs_pron: they ] -> themself
// Names
// The added features are:
// * num: sg, as people are singular ("mary likes her" / *"mary like her")
// * case: **top**, as names can be both subjects and objects
// ("mary likes her" / "she likes mary")
// * pron: whichever pronoun the person uses for reflexive agreement
// mary pron: she => mary likes herself
// sue pron: they => sue likes themself
// takeshi pron: he => takeshi likes himself
N[ num: sg, case: **top**, pron: she ] -> mary
N[ num: sg, case: **top**, pron: they ] -> sue
N[ num: sg, case: **top**, pron: he ] -> takeshi
N[ num: sg, case: **top**, pron: he ] -> robert
// Complementizer doesn't need features
Comp -> that
// Verbs -- intransitive, transitive, and clausal
// The added features are:
// * num: sg, pl, or **top** -- to match the noun numbers.
// **top** will match either sg or pl, as past-tense verbs in English
// don't agree in number: "he fell" and "they fell" are both fine
// * tense: past or nonpast -- this won't be used for agreement, but will be
// copied into the final feature structure, and the client code could do
// something with it
IV[ num: sg, tense: nonpast ] -> falls
IV[ num: pl, tense: nonpast ] -> fall
IV[ num: **top**, tense: past ] -> fell
TV[ num: sg, tense: nonpast ] -> likes
TV[ num: pl, tense: nonpast ] -> like
TV[ num: **top**, tense: past ] -> liked
CV[ num: sg, tense: nonpast ] -> says
CV[ num: pl, tense: nonpast ] -> say
CV[ num: **top**, tense: past ] -> said
Now that our lexicon is updated with features, we can update our sentence rules to constrain parsing based on those features. This uses two new features, tags and unification. Tags allow features to be associated between nodes in a rule, and unification controls how those features are compatible. The rules for unification are:
If unification fails anywhere, the parse is aborted and the tree is discarded. This allows the programmer to discard trees if features don't match.
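As a rough illustration of unification, here is a small Python sketch. It assumes the usual feature-structure semantics: atomic values unify only when equal, **top** unifies with anything, and nested structures unify feature by feature. The `unify` and `can_unify` names are hypothetical, and the sketch returns a merged copy rather than sharing nodes through tags as the library's feature DAG does.

```python
TOP = "**top**"


class UnificationError(Exception):
    """Raised when two feature values are incompatible."""


def unify(a, b):
    """Unify two feature values: atoms (strings) or nested dicts of features."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            # A feature missing on one side is treated as **top**.
            merged[key] = unify(merged.get(key, TOP), value)
        return merged
    if a == b:
        return a
    raise UnificationError("cannot unify {!r} with {!r}".format(a, b))


def can_unify(a, b):
    """Return True if unification succeeds, False if it fails."""
    try:
        unify(a, b)
        return True
    except UnificationError:
        return False


he = {"num": "sg", "case": "nom", "pron": "he"}
fell = {"num": TOP, "tense": "past"}
print(unify(he, {"case": "nom"}))                 # constraint already satisfied
print(unify({"num": he["num"]}, {"num": fell["num"]}))
print(can_unify({"num": "sg"}, {"num": "pl"}))
```

A failed `unify` call corresponds to the case described above, where the parse is aborted and the tree discarded.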
// Sentence rules
// Intransitive verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #1)
S -> N[ case: nom, num: #1 ] IV[ num: #1 ]
// Transitive verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #2)
// * If there's a reflexive in the object position, make sure its `needs_pron`
// feature matches the subject's `pron` feature. If the object isn't a
// reflexive, then its `needs_pron` feature will implicitly be `**top**`, so
// will unify with anything.
S -> N[ case: nom, pron: #1, num: #2 ] TV[ num: #2 ] N[ case: acc, needs_pron: #1 ]
// Clausal verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #1)
// * Reflexives can't cross clause boundaries (*"He said that she likes himself"),
// so we can ignore reflexives and delegate to inner clause rule
S -> N[ case: nom, num: #1 ] CV[ num: #1 ] Comp S
Now that we have this augmented grammar (available as examples/reflexives.fgr), we can try it out and see that it rejects illicit sentences that were previously accepted, while still accepting valid ones:
> he fell
Parsed 1 tree
(0..2: S
(0..1: N (0..1: he))
(1..2: IV (1..2: fell)))
[
child-1: [
child-0: [ word: fell ]
num: #0 sg
tense: past
]
child-0: [
pron: he
case: nom
num: #0
child-0: [ word: he ]
]
]
> he like him
Parsed 0 trees
> he likes himself
Parsed 1 tree
(0..3: S
(0..1: N (0..1: he))
(1..2: TV (1..2: likes))
(2..3: N (2..3: himself)))
[
child-1: [
num: #0 sg
child-0: [ word: likes ]
tense: nonpast
]
child-2: [
needs_pron: #1 he
num: sg
child-0: [ word: himself ]
pron: ref
case: acc
]
child-0: [
child-0: [ word: he ]
pron: #1
num: #0
case: nom
]
]
> he likes herself
Parsed 0 trees
> mary likes herself
Parsed 1 tree
(0..3: S
(0..1: N (0..1: mary))
(1..2: TV (1..2: likes))
(2..3: N (2..3: herself)))
[
child-0: [
pron: #0 she
num: #1 sg
case: nom
child-0: [ word: mary ]
]
child-1: [
tense: nonpast
child-0: [ word: likes ]
num: #1
]
child-2: [
child-0: [ word: herself ]
num: sg
pron: ref
case: acc
needs_pron: #0
]
]
> mary likes themself
Parsed 0 trees
> sue likes themself
Parsed 1 tree
(0..3: S
(0..1: N (0..1: sue))
(1..2: TV (1..2: likes))
(2..3: N (2..3: themself)))
[
child-0: [
pron: #0 they
child-0: [ word: sue ]
case: nom
num: #1 sg
]
child-1: [
tense: nonpast
num: #1
child-0: [ word: likes ]
]
child-2: [
needs_pron: #0
case: acc
pron: ref
child-0: [ word: themself ]
num: sg
]
]
> sue likes himself
Parsed 0 trees
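The accept and reject decisions above can be reproduced by hand for the transitive rule. The Python sketch below hard-codes a slice of the featured lexicon and checks the rule's constraints, treating each tag as "these two values must be compatible". The `compatible` and `accepts_transitive` helpers are hypothetical and cover only atomic features, so this illustrates the logic rather than the library's implementation.

```python
TOP = "**top**"

# A slice of the featured lexicon above.
NOUNS = {
    "he": {"num": "sg", "case": "nom", "pron": "he"},
    "him": {"num": "sg", "case": "acc", "pron": "he"},
    "himself": {"num": "sg", "case": "acc", "pron": "ref", "needs_pron": "he"},
    "herself": {"num": "sg", "case": "acc", "pron": "ref", "needs_pron": "she"},
    "themself": {"num": "sg", "case": "acc", "pron": "ref", "needs_pron": "they"},
    "mary": {"num": "sg", "case": TOP, "pron": "she"},
    "sue": {"num": "sg", "case": TOP, "pron": "they"},
}
VERBS = {
    "likes": {"num": "sg", "tense": "nonpast"},
    "like": {"num": "pl", "tense": "nonpast"},
    "liked": {"num": TOP, "tense": "past"},
}


def compatible(a, b):
    """Atomic unification: equal values match, and **top** matches anything."""
    return a == TOP or b == TOP or a == b


def accepts_transitive(subject, verb, obj):
    """Check S -> N[case: nom, pron: #1, num: #2] TV[num: #2] N[case: acc, needs_pron: #1]."""
    subj, tv, ob = NOUNS[subject], VERBS[verb], NOUNS[obj]
    return (
        compatible(subj["case"], "nom")
        and compatible(subj["num"], tv["num"])                   # tag #2
        and compatible(ob["case"], "acc")
        and compatible(subj["pron"], ob.get("needs_pron", TOP))  # tag #1
    )


for words in [("he", "likes", "himself"), ("he", "likes", "herself"),
              ("he", "like", "him"), ("sue", "likes", "themself")]:
    print(" ".join(words), "->", accepts_transitive(*words))
```

An object with no `needs_pron` feature falls back to **top**, which is why non-reflexive objects such as "him" place no constraint on the subject's pronoun.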
If this is interesting to you and you want to learn more, you can check out my blog series, the excellent textbook Syntactic Theory: A Formal Introduction (2nd ed.), and the DELPH-IN project, whose work on the LKB inspired this simplified version.
I need to write this section in more detail, but if you're comfortable with Rust, I suggest looking through the codebase. It's not perfect, it started as one of my first Rust projects (after migrating through F# -> TypeScript -> C in search of the right performance/ergonomics tradeoff), and it could use more tests, but overall it's not too bad.
Basically, the processing pipeline is:
Grammar struct: Grammar is defined in rules.rs. A Grammar is created with Grammar::parse_from_file, which is mostly a hand-written recursive descent parser in parse_grammar.rs. Yes, I recognize the irony here.
Parsing: Grammar::parse, which does everything for you, or Grammar::parse_chart, which just does the chart.
Chart: earley.rs.
Forest: forest.rs, using an algorithm I found in a very useful blog series I forget the URL for, because the algorithms in the academic literature for this are... weird.
The most interesting thing you can do via code and not via the CLI is probably getting at the raw feature DAG, as that would let you do things like pronoun coreference. The DAG code is in featurestructure.rs, and should be fairly approachable -- there's a lot of Rust ceremony around Rc<RefCell<...>> because using an arena allocation crate seemed like overkill, but that is somewhat mitigated by the NodeRef type alias. Hit me up at https://vgel.me/contact if you need help with anything here!
Download Details:
Author: vgel
Source Code: https://github.com/vgel/treebender
License: MIT License
1648900800
I founded this project because I wanted to publish the code I wrote over the last two years, when I tried to write enhanced checking for PostgreSQL upstream. It was not fully successful: integration into upstream requires some larger plpgsql refactoring, and it probably will not be done in the next few years (it is now Dec 2013). But the written code is fully functional and can be used in production (and it is used in production). So I created this extension to make it available to all plpgsql developers.
If you like it and would like to join the development of this extension, register with the postgresql extension hacking Google group.
Features
I welcome any ideas, patches, and bug reports.
plpgsql_check is the next generation of plpgsql_lint. It allows you to check source code by explicitly calling plpgsql_check_function.
PostgreSQL 10, 11, 12, 13 and 14 are supported.
The SQL statements inside PL/pgSQL functions are checked by the validator for semantic errors. These errors can be found by plpgsql_check_function:
Active mode
postgres=# CREATE EXTENSION plpgsql_check;
CREATE EXTENSION
postgres=# CREATE TABLE t1(a int, b int);
CREATE TABLE
postgres=#
CREATE OR REPLACE FUNCTION public.f1()
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE r record;
BEGIN
FOR r IN SELECT * FROM t1
LOOP
RAISE NOTICE '%', r.c; -- there is bug - table t1 missing "c" column
END LOOP;
END;
$function$;
CREATE FUNCTION
postgres=# select f1(); -- execution doesn't find a bug due to empty table t1
f1
────
(1 row)
postgres=# \x
Expanded display is on.
postgres=# select * from plpgsql_check_function_tb('f1()');
─[ RECORD 1 ]───────────────────────────
functionid │ f1
lineno │ 6
statement │ RAISE
sqlstate │ 42703
message │ record "r" has no field "c"
detail │ [null]
hint │ [null]
level │ error
position │ 0
query │ [null]
postgres=# \sf+ f1
CREATE OR REPLACE FUNCTION public.f1()
RETURNS void
LANGUAGE plpgsql
1 AS $function$
2 DECLARE r record;
3 BEGIN
4 FOR r IN SELECT * FROM t1
5 LOOP
6 RAISE NOTICE '%', r.c; -- there is bug - table t1 missing "c" column
7 END LOOP;
8 END;
9 $function$
The function plpgsql_check_function() has three possible output formats: text, json or xml.
select * from plpgsql_check_function('f1()', fatal_errors := false);
plpgsql_check_function
------------------------------------------------------------------------
error:42703:4:SQL statement:column "c" of relation "t1" does not exist
Query: update t1 set c = 30
-- ^
error:42P01:7:RAISE:missing FROM-clause entry for table "r"
Query: SELECT r.c
-- ^
error:42601:7:RAISE:too few parameters specified for RAISE
(7 rows)
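If you collect the text-format output in a script, each issue line can be split into its parts. The sketch below is only an illustration in Python: the field order (level, sqlstate, line number, statement, message) is inferred from the sample output shown here, and the `Issue` and `parse_report` names are hypothetical, not part of plpgsql_check.

```python
from typing import List, NamedTuple


class Issue(NamedTuple):
    level: str
    sqlstate: str
    lineno: int
    statement: str
    message: str


def parse_report(lines: List[str]) -> List[Issue]:
    """Parse issue lines; skip continuation lines such as 'Query: ...'."""
    issues = []
    for line in lines:
        try:
            # Split into at most 5 parts, so colons inside the message survive.
            level, sqlstate, lineno, statement, message = line.split(":", 4)
            issues.append(Issue(level, sqlstate, int(lineno), statement, message))
        except ValueError:
            continue  # not an issue line (too few fields or non-numeric lineno)
    return issues


report = [
    'error:42703:4:SQL statement:column "c" of relation "t1" does not exist',
    "Query: update t1 set c = 30",
    "--                    ^",
    'error:42P01:7:RAISE:missing FROM-clause entry for table "r"',
    "Query: SELECT r.c",
    "--            ^",
    "error:42601:7:RAISE:too few parameters specified for RAISE",
]

issues = parse_report(report)
for issue in issues:
    print(issue.lineno, issue.sqlstate, issue.message)
```

For anything beyond quick scripting, the json or xml formats are probably easier to consume than splitting text lines.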
postgres=# select * from plpgsql_check_function('fx()', format:='xml');
plpgsql_check_function
────────────────────────────────────────────────────────────────
<Function oid="16400"> ↵
<Issue> ↵
<Level>error</Level> ↵
<Sqlstate>42P01</Sqlstate> ↵
<Message>relation "foo111" does not exist</Message> ↵
<Stmt lineno="3">RETURN</Stmt> ↵
<Query position="23">SELECT (select a from foo111)</Query>↵
</Issue> ↵
</Function>
(1 row)
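The XML format is convenient for tooling. As a rough sketch, the report can be read with Python's standard xml.etree.ElementTree module. The element and attribute names below are taken from the sample output shown here (written with matching <Level>...</Level> tags so that it is well-formed), and the dictionary layout is just an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample report copied from the example output, with line-wrap markers removed.
report = """<Function oid="16400">
  <Issue>
    <Level>error</Level>
    <Sqlstate>42P01</Sqlstate>
    <Message>relation "foo111" does not exist</Message>
    <Stmt lineno="3">RETURN</Stmt>
    <Query position="23">SELECT (select a from foo111)</Query>
  </Issue>
</Function>"""

root = ET.fromstring(report)

# Collect one dictionary per <Issue> element.
issues = [
    {
        "level": issue.findtext("Level"),
        "sqlstate": issue.findtext("Sqlstate"),
        "message": issue.findtext("Message"),
        "statement": issue.findtext("Stmt"),
        "lineno": int(issue.find("Stmt").get("lineno")),
    }
    for issue in root.iter("Issue")
]

for item in issues:
    print(item["level"], item["lineno"], item["message"])
```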
You can set the level of warnings via the function's parameters:
The function is specified as 'fx()'::regprocedure or 16799::regprocedure. A possible alternative is using the name only, when the function's name is unique, like 'fx'. When the name is not unique or the function doesn't exist, an error is raised.
relid DEFAULT 0 - oid of the relation assigned to the trigger function. It is necessary for checking any trigger function.
fatal_errors boolean DEFAULT true - stop on the first error
other_warnings boolean DEFAULT true - show warnings like a different number of attributes on the left and right side of an assignment, a variable that overlaps a function's parameter, unused variables, unwanted casting, ..
extra_warnings boolean DEFAULT true - show warnings like missing RETURN, shadowed variables, dead code, never read (unused) function's parameter, unmodified variables, modified auto variables, ..
performance_warnings boolean DEFAULT false - performance related warnings like declared type with type modifier, casting, implicit casts in where clause (can be the reason why an index is not used), ..
security_warnings boolean DEFAULT false - security related checks like SQL injection vulnerability detection
anyelementtype regtype DEFAULT 'int' - a real type used instead of the anyelement type
anyenumtype regtype DEFAULT '-' - a real type used instead of the anyenum type
anyrangetype regtype DEFAULT 'int4range' - a real type used instead of the anyrange type
anycompatibletype DEFAULT 'int' - a real type used instead of the anycompatible type
anycompatiblerangetype DEFAULT 'int4range' - a real type used instead of the anycompatible range type
without_warnings DEFAULT false - disable all warnings
all_warnings DEFAULT false - enable all warnings
newtable DEFAULT NULL, oldtable DEFAULT NULL - the names of the NEW or OLD transition tables. These parameters are required when transition tables are used.
When you want to check a trigger function, you have to specify the relation that will be used together with the trigger function:
CREATE TABLE bar(a int, b int);
postgres=# \sf+ foo_trg
CREATE OR REPLACE FUNCTION public.foo_trg()
RETURNS trigger
LANGUAGE plpgsql
1 AS $function$
2 BEGIN
3 NEW.c := NEW.a + NEW.b;
4 RETURN NEW;
5 END;
6 $function$
Missing relation specification
postgres=# select * from plpgsql_check_function('foo_trg()');
ERROR: missing trigger relation
HINT: Trigger relation oid must be valid
Correct trigger checking (with specified relation)
postgres=# select * from plpgsql_check_function('foo_trg()', 'bar');
plpgsql_check_function
--------------------------------------------------------
error:42703:3:assignment:record "new" has no field "c"
(1 row)
For triggers with transition tables, you can set the oldtable or newtable parameters:
create or replace function footab_trig_func()
returns trigger as $$
declare x int;
begin
if false then
-- should be ok;
select count(*) from newtab into x;
-- should fail;
select count(*) from newtab where d = 10 into x;
end if;
return null;
end;
$$ language plpgsql;
select * from plpgsql_check_function('footab_trig_func','footab', newtable := 'newtab');
You can use plpgsql_check_function to mass-check functions and triggers. Please test the following queries:
-- check all nontrigger plpgsql functions
SELECT p.oid, p.proname, plpgsql_check_function(p.oid)
FROM pg_catalog.pg_namespace n
JOIN pg_catalog.pg_proc p ON pronamespace = n.oid
JOIN pg_catalog.pg_language l ON p.prolang = l.oid
WHERE l.lanname = 'plpgsql' AND p.prorettype <> 2279;
or
SELECT p.proname, tgrelid::regclass, cf.*
FROM pg_proc p
JOIN pg_trigger t ON t.tgfoid = p.oid
JOIN pg_language l ON p.prolang = l.oid
JOIN pg_namespace n ON p.pronamespace = n.oid,
LATERAL plpgsql_check_function(p.oid, t.tgrelid) cf
WHERE n.nspname = 'public' and l.lanname = 'plpgsql';
or
-- check all plpgsql functions (functions or trigger functions with defined triggers)
SELECT
(pcf).functionid::regprocedure, (pcf).lineno, (pcf).statement,
(pcf).sqlstate, (pcf).message, (pcf).detail, (pcf).hint, (pcf).level,
(pcf)."position", (pcf).query, (pcf).context
FROM
(
SELECT
plpgsql_check_function_tb(pg_proc.oid, COALESCE(pg_trigger.tgrelid, 0)) AS pcf
FROM pg_proc
LEFT JOIN pg_trigger
ON (pg_trigger.tgfoid = pg_proc.oid)
WHERE
prolang = (SELECT lang.oid FROM pg_language lang WHERE lang.lanname = 'plpgsql') AND
pronamespace <> (SELECT nsp.oid FROM pg_namespace nsp WHERE nsp.nspname = 'pg_catalog') AND
-- ignore unused triggers
(pg_proc.prorettype <> (SELECT typ.oid FROM pg_type typ WHERE typ.typname = 'trigger') OR
pg_trigger.tgfoid IS NOT NULL)
OFFSET 0
) ss
ORDER BY (pcf).functionid::regprocedure::text, (pcf).lineno;
Passive mode
In passive mode, functions are checked when they start; for this, the plpgsql_check module must be loaded.
plpgsql_check.mode = [ disabled | by_function | fresh_start | every_start ]
plpgsql_check.fatal_errors = [ yes | no ]
plpgsql_check.show_nonperformance_warnings = false
plpgsql_check.show_performance_warnings = false
The default mode is by_function, which means that the enhanced check is done only in active mode, by plpgsql_check_function. fresh_start means cold start.
You can enable passive mode with:
load 'plpgsql'; -- 1.1 and higher doesn't need it
load 'plpgsql_check';
set plpgsql_check.mode = 'every_start';
SELECT fx(10); -- run functions - function is checked before runtime starts it
Limits
plpgsql_check should find almost all errors in really static code. When a developer uses some of PL/pgSQL's dynamic features, like dynamic SQL or the record data type, false positives are possible. These should be rare in well-written code; in such cases the affected function should be redesigned, or plpgsql_check should be disabled for that function.
CREATE OR REPLACE FUNCTION f1()
RETURNS void AS $$
DECLARE r record;
BEGIN
FOR r IN EXECUTE 'SELECT * FROM t1'
LOOP
RAISE NOTICE '%', r.c;
END LOOP;
END;
$$ LANGUAGE plpgsql SET plpgsql.enable_check TO false;
Using plpgsql_check adds a small overhead (when passive mode is enabled), so you should use it only in development or preproduction environments.
This module doesn't check queries that are assembled at runtime. It is not possible to identify the results of dynamic queries, so plpgsql_check cannot set the correct type for record variables and cannot check dependent SQL statements and expressions.
When the type of a record variable is not known, you can assign it explicitly with the type pragma:
DECLARE r record;
BEGIN
EXECUTE format('SELECT * FROM %I', _tablename) INTO r;
PERFORM plpgsql_check_pragma('type: r (id int, processed bool)');
IF NOT r.processed THEN
...
Attention: The SQL injection check can detect only some SQL injection vulnerabilities. This tool cannot be used for a security audit! Some issues may not be detected. This check can raise false alarms too, probably when a variable is sanitized by another command or when the value is of some composite type.
plpgsql_check cannot detect the structure of referenced cursors. A reference to a cursor in PL/pgSQL is implemented as the name of a global cursor. At check time, the name is not known (not in all cases), and the global cursor doesn't exist. This is a significant obstacle for any static analysis. plpgsql_check cannot set the correct type for record variables and cannot check dependent SQL statements and expressions. The solution is the same as for dynamic SQL: don't use a record variable as the target when you use the refcursor type, or disable plpgsql_check for these functions.
CREATE OR REPLACE FUNCTION foo(refcur_var refcursor)
RETURNS void AS $$
DECLARE
rec_var record;
BEGIN
FETCH refcur_var INTO rec_var; -- this is STOP for plpgsql_check
RAISE NOTICE '%', rec_var; -- record rec_var is not assigned yet error
In this case a record type should not be used (use known rowtype instead):
CREATE OR REPLACE FUNCTION foo(refcur_var refcursor)
RETURNS void AS $$
DECLARE
rec_var some_rowtype;
BEGIN
FETCH refcur_var INTO rec_var;
RAISE NOTICE '%', rec_var;
plpgsql_check cannot verify queries over temporary tables that are created at the plpgsql function's runtime. For this use case it is necessary to create a fake temp table or disable plpgsql_check for this function.
In reality, temp tables are stored in their own (per-user) schema with higher priority than persistent tables. So you can safely use the following trick:
CREATE OR REPLACE FUNCTION public.disable_dml()
RETURNS trigger
LANGUAGE plpgsql AS $function$
BEGIN
RAISE EXCEPTION SQLSTATE '42P01'
USING message = format('this instance of %I table doesn''t allow any DML operation', TG_TABLE_NAME),
hint = format('you should to run "CREATE TEMP TABLE %1$I(LIKE %1$I INCLUDING ALL);" statement',
TG_TABLE_NAME);
RETURN NULL;
END;
$function$;
CREATE TABLE foo(a int, b int); -- doesn't hold data ever
CREATE TRIGGER foo_disable_dml
BEFORE INSERT OR UPDATE OR DELETE ON foo
EXECUTE PROCEDURE disable_dml();
postgres=# INSERT INTO foo VALUES(10,20);
ERROR: this instance of foo table doesn't allow any DML operation
HINT: you should to run "CREATE TEMP TABLE foo(LIKE foo INCLUDING ALL);" statement
postgres=# CREATE TEMP TABLE foo(LIKE foo INCLUDING ALL);
CREATE TABLE
postgres=# INSERT INTO foo VALUES(10,20);
INSERT 0 1
This trick partially emulates GLOBAL TEMP tables and allows static validation. Another possibility is using a template foreign data wrapper (https://github.com/okbob/template_fdw).
You can also use the table pragma to create an ephemeral table:
BEGIN
CREATE TEMP TABLE xxx(a int);
PERFORM plpgsql_check_pragma('table: xxx(a int)');
INSERT INTO xxx VALUES(10);
Dependency list
The function plpgsql_show_dependency_tb shows all functions, operators and relations used inside the processed function:
postgres=# select * from plpgsql_show_dependency_tb('testfunc(int,float)');
┌──────────┬───────┬────────┬─────────┬────────────────────────────┐
│ type │ oid │ schema │ name │ params │
╞══════════╪═══════╪════════╪═════════╪════════════════════════════╡
│ FUNCTION │ 36008 │ public │ myfunc1 │ (integer,double precision) │
│ FUNCTION │ 35999 │ public │ myfunc2 │ (integer,double precision) │
│ OPERATOR │ 36007 │ public │ ** │ (integer,integer) │
│ RELATION │ 36005 │ public │ myview │ │
│ RELATION │ 36002 │ public │ mytable │ │
└──────────┴───────┴────────┴─────────┴────────────────────────────┘
(5 rows)
Profiler
plpgsql_check contains a simple profiler of plpgsql functions and procedures. It can work with or without access to shared memory, depending on the shared_preload_libraries config. When plpgsql_check is initialized by shared_preload_libraries, it can allocate shared memory, and function profiles are stored there. When plpgsql_check cannot allocate shared memory, the profile is stored in session memory.
Due to dependencies, shared_preload_libraries should contain plpgsql first:
postgres=# show shared_preload_libraries ;
┌──────────────────────────┐
│ shared_preload_libraries │
╞══════════════════════════╡
│ plpgsql,plpgsql_check │
└──────────────────────────┘
(1 row)
The profiler is active when the GUC plpgsql_check.profiler is on. The profiler doesn't require shared memory, but if there is no shared memory, the profile is limited to the active session.
When plpgsql_check is initialized by shared_preload_libraries, another GUC is available to configure the amount of shared memory used by the profiler: plpgsql_check.profiler_max_shared_chunks. This defines the maximum number of statement chunks that can be stored in shared memory. For each plpgsql function (or procedure), the whole content is split into chunks of 30 statements. If needed, multiple chunks can be used to store the whole content of a single function. A single chunk is 1704 bytes. The default value for this GUC is 15000, which should be enough for big projects containing hundreds of thousands of statements in plpgsql, and will consume about 24MB of memory. If your project doesn't require that many chunks, you can set this parameter to a smaller number in order to decrease the memory usage. The minimum value is 50 (which should consume about 83kB of memory), and the maximum value is 100000 (which should consume about 163MB of memory). Changing this parameter requires a PostgreSQL restart.
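The memory figures quoted for plpgsql_check.profiler_max_shared_chunks follow from simple arithmetic on the chunk size. The Python sketch below reproduces it using only the numbers stated in the text (1704 bytes per chunk, 30 statements per chunk). The function names are hypothetical, and any fixed overhead beyond the chunks themselves is ignored, which is an assumption.

```python
import math

CHUNK_BYTES = 1704          # size of a single chunk, as stated in the text
STATEMENTS_PER_CHUNK = 30   # statements stored per chunk, as stated in the text


def chunks_for_function(statement_count: int) -> int:
    """Number of chunks needed to hold one function's statements."""
    return math.ceil(statement_count / STATEMENTS_PER_CHUNK)


def shared_memory_bytes(max_chunks: int) -> int:
    """Approximate memory for a given profiler_max_shared_chunks value."""
    return max_chunks * CHUNK_BYTES


# Minimum, default and maximum values of the setting.
for max_chunks in (50, 15000, 100000):
    size = shared_memory_bytes(max_chunks)
    print(f"{max_chunks:>6} chunks: {size} bytes, {size / 2**20:.1f} MiB")
```

For example, a function with 31 statements needs two chunks, and the default of 15000 chunks works out to 25,560,000 bytes, which matches the "about 24MB" figure.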
The profiler will also retrieve the query identifier for each instruction that contains an expression or optimizable statement. Note that this requires pg_stat_statements, or another similar third-party extension, to be installed. There are some limitations to the query identifier retrieval.
Attention: An update of shared profiles can decrease performance on servers under higher load.
The profile can be displayed by the function plpgsql_profiler_function_tb:
postgres=# select lineno, avg_time, source from plpgsql_profiler_function_tb('fx(int)');
┌────────┬──────────┬───────────────────────────────────────────────────────────────────┐
│ lineno │ avg_time │ source │
╞════════╪══════════╪═══════════════════════════════════════════════════════════════════╡
│ 1 │ │ │
│ 2 │ │ declare result int = 0; │
│ 3 │ 0.075 │ begin │
│ 4 │ 0.202 │ for i in 1..$1 loop │
│ 5 │ 0.005 │ select result + i into result; select result + i into result; │
│ 6 │ │ end loop; │
│ 7 │ 0 │ return result; │
│ 8 │ │ end; │
└────────┴──────────┴───────────────────────────────────────────────────────────────────┘
(9 rows)
The profile per statement (not per line) can be displayed by the function plpgsql_profiler_function_statements_tb:
CREATE OR REPLACE FUNCTION public.fx1(a integer)
RETURNS integer
LANGUAGE plpgsql
1 AS $function$
2 begin
3 if a > 10 then
4 raise notice 'ahoj';
5 return -1;
6 else
7 raise notice 'nazdar';
8 return 1;
9 end if;
10 end;
11 $function$
postgres=# select stmtid, parent_stmtid, parent_note, lineno, exec_stmts, stmtname
from plpgsql_profiler_function_statements_tb('fx1');
┌────────┬───────────────┬─────────────┬────────┬────────────┬─────────────────┐
│ stmtid │ parent_stmtid │ parent_note │ lineno │ exec_stmts │ stmtname │
╞════════╪═══════════════╪═════════════╪════════╪════════════╪═════════════════╡
│ 0 │ ∅ │ ∅ │ 2 │ 0 │ statement block │
│ 1 │ 0 │ body │ 3 │ 0 │ IF │
│ 2 │ 1 │ then body │ 4 │ 0 │ RAISE │
│ 3 │ 1 │ then body │ 5 │ 0 │ RETURN │
│ 4 │ 1 │ else body │ 7 │ 0 │ RAISE │
│ 5 │ 1 │ else body │ 8 │ 0 │ RETURN │
└────────┴───────────────┴─────────────┴────────┴────────────┴─────────────────┘
(6 rows)
All stored profiles can be displayed by calling the function plpgsql_profiler_functions_all:
postgres=# select * from plpgsql_profiler_functions_all();
┌───────────────────────┬────────────┬────────────┬──────────┬─────────────┬──────────┬──────────┐
│ funcoid │ exec_count │ total_time │ avg_time │ stddev_time │ min_time │ max_time │
╞═══════════════════════╪════════════╪════════════╪══════════╪═════════════╪══════════╪══════════╡
│ fxx(double precision) │ 1 │ 0.01 │ 0.01 │ 0.00 │ 0.01 │ 0.01 │
└───────────────────────┴────────────┴────────────┴──────────┴─────────────┴──────────┴──────────┘
(1 row)
There are two functions for cleaning stored profiles: plpgsql_profiler_reset_all() and plpgsql_profiler_reset(regprocedure).
plpgsql_check provides two functions:
plpgsql_coverage_statements(name)
plpgsql_coverage_branches(name)
There is another very good PL/pgSQL profiler: https://bitbucket.org/openscg/plprofiler
My extension is designed to be simple to use and practical. Nothing more or less.
plprofiler is more complex. It builds call graphs, and from these graphs it can create flame graphs of execution times.
Both extensions can be used together with PostgreSQL's built-in function tracking feature.
set track_functions to 'pl';
...
select * from pg_stat_user_functions;
Tracer
plpgsql_check provides tracing: in this mode you can see notices on the start or end of functions (terse and default verbosity) and on the start or end of statements (verbose verbosity). For default and verbose verbosity, the content of function arguments is displayed. The content of related variables is displayed when the verbosity is verbose.
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #0 ->> start of inline_code_block (Oid=0)
NOTICE: #2 ->> start of function fx(integer,integer,date,text) (Oid=16405)
NOTICE: #2 call by inline_code_block line 1 at PERFORM
NOTICE: #2 "a" => '10', "b" => null, "c" => '2020-08-03', "d" => 'stěhule'
NOTICE: #4 ->> start of function fx(integer) (Oid=16404)
NOTICE: #4 call by fx(integer,integer,date,text) line 1 at PERFORM
NOTICE: #4 "a" => '10'
NOTICE: #4 <<- end of function fx (elapsed time=0.098 ms)
NOTICE: #2 <<- end of function fx (elapsed time=0.399 ms)
NOTICE: #0 <<- end of block (elapsed time=0.754 ms)
The number after # is an execution frame counter (this number is related to the depth of the error context stack). It lets you pair the start and end of a function.
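As an illustration of how the frame counter can be used, the Python sketch below pairs start and end notices by their frame number and extracts the elapsed time. The regular expressions are based only on the default-verbosity sample output shown here, so treat the exact message layout as an assumption. `pair_frames` is a hypothetical helper, not something plpgsql_check provides.

```python
import re

# Matches function/block boundary notices such as "NOTICE: #2 ->> start of ...".
FRAME_RE = re.compile(r"NOTICE:\s+#(?P<frame>\d+)\s+(?P<arrow>->>|<<-)\s+(?P<text>.*)")
ELAPSED_RE = re.compile(r"elapsed time=(?P<ms>[\d.]+) ms")


def pair_frames(lines):
    """Return (frame, start_text, elapsed_ms) for each matched start/end pair."""
    open_frames = {}
    pairs = []
    for line in lines:
        match = FRAME_RE.match(line)
        if not match:
            continue  # argument dumps and "call by" lines are ignored
        frame = int(match["frame"])
        if match["arrow"] == "->>":
            open_frames[frame] = match["text"]
        else:
            start_text = open_frames.pop(frame, None)
            elapsed = ELAPSED_RE.search(match["text"])
            pairs.append((frame, start_text, float(elapsed["ms"]) if elapsed else None))
    return pairs


log = """\
NOTICE: #0 ->> start of inline_code_block (Oid=0)
NOTICE: #2 ->> start of function fx(integer,integer,date,text) (Oid=16405)
NOTICE: #2 call by inline_code_block line 1 at PERFORM
NOTICE: #4 ->> start of function fx(integer) (Oid=16404)
NOTICE: #4 <<- end of function fx (elapsed time=0.098 ms)
NOTICE: #2 <<- end of function fx (elapsed time=0.399 ms)
NOTICE: #0 <<- end of block (elapsed time=0.754 ms)
""".splitlines()

pairs = pair_frames(log)
for frame, start_text, elapsed_ms in pairs:
    print(frame, start_text, elapsed_ms)
```

Because both overloads of fx end with the same "end of function fx" text, the frame number is what tells you which call has finished.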
Tracing is enabled by setting plpgsql_check.tracer to on. Attention: enabling this behaviour has a significant negative impact on performance (unlike the profiler). You can set the output level used by the tracer with plpgsql_check.tracer_errlevel (default is notice). The output content is limited by the length specified by the plpgsql_check.tracer_variable_max_length configuration variable.
In terse verbosity mode the output is reduced:
postgres=# set plpgsql_check.tracer_verbosity TO terse;
SET
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #0 start of inline code block (oid=0)
NOTICE: #2 start of fx (oid=16405)
NOTICE: #4 start of fx (oid=16404)
NOTICE: #4 end of fx
NOTICE: #2 end of fx
NOTICE: #0 end of inline code block
In verbose mode the output is extended with statement details:
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #0 ->> start of block inline_code_block (oid=0)
NOTICE: #0.1 1 --> start of PERFORM
NOTICE: #2 ->> start of function fx(integer,integer,date,text) (oid=16405)
NOTICE: #2 call by inline_code_block line 1 at PERFORM
NOTICE: #2 "a" => '10', "b" => null, "c" => '2020-08-04', "d" => 'stěhule'
NOTICE: #2.1 1 --> start of PERFORM
NOTICE: #2.1 "a" => '10'
NOTICE: #4 ->> start of function fx(integer) (oid=16404)
NOTICE: #4 call by fx(integer,integer,date,text) line 1 at PERFORM
NOTICE: #4 "a" => '10'
NOTICE: #4.1 6 --> start of assignment
NOTICE: #4.1 "a" => '10', "b" => '20'
NOTICE: #4.1 <-- end of assignment (elapsed time=0.076 ms)
NOTICE: #4.1 "res" => '130'
NOTICE: #4.2 7 --> start of RETURN
NOTICE: #4.2 "res" => '130'
NOTICE: #4.2 <-- end of RETURN (elapsed time=0.054 ms)
NOTICE: #4 <<- end of function fx (elapsed time=0.373 ms)
NOTICE: #2.1 <-- end of PERFORM (elapsed time=0.589 ms)
NOTICE: #2 <<- end of function fx (elapsed time=0.727 ms)
NOTICE: #0.1 <-- end of PERFORM (elapsed time=1.147 ms)
NOTICE: #0 <<- end of block (elapsed time=1.286 ms)
A special feature of the tracer is tracing of the ASSERT statement when plpgsql_check.trace_assert is on. When plpgsql_check.trace_assert_verbosity is DEFAULT, all of the function's or procedure's variables are displayed when the assert expression is false. When this configuration is VERBOSE, all variables from all plpgsql frames are displayed. This behaviour is independent of the plpgsql.check_asserts value. It can be used even when assertions are disabled in the plpgsql runtime.
postgres=# set plpgsql_check.tracer to off;
postgres=# set plpgsql_check.trace_assert_verbosity TO verbose;
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #4 PLpgSQL assert expression (false) on line 12 of fx(integer) is false
NOTICE: "a" => '10', "res" => null, "b" => '20'
NOTICE: #2 PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
NOTICE: "a" => '10', "b" => null, "c" => '2020-08-05', "d" => 'stěhule'
NOTICE: #0 PL/pgSQL function inline_code_block line 1 at PERFORM
ERROR: assertion failed
CONTEXT: PL/pgSQL function fx(integer) line 12 at ASSERT
SQL statement "SELECT fx(a)"
PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
SQL statement "SELECT fx(10,null, 'now', e'stěhule')"
PL/pgSQL function inline_code_block line 1 at PERFORM
postgres=# set plpgsql.check_asserts to off;
SET
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #4 PLpgSQL assert expression (false) on line 12 of fx(integer) is false
NOTICE: "a" => '10', "res" => null, "b" => '20'
NOTICE: #2 PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
NOTICE: "a" => '10', "b" => null, "c" => '2020-08-05', "d" => 'stěhule'
NOTICE: #0 PL/pgSQL function inline_code_block line 1 at PERFORM
DO
The tracer prints the content of variables or function arguments. For a security definer function, this content can hold security-sensitive data. This is the reason why the tracer is disabled by default and should be enabled only with superuser rights (plpgsql_check.enable_tracer).
Pragma
You can configure plpgsql_check's behavior inside a checked function with the "pragma" function. This is analogous to the PRAGMA feature of the PL/SQL or Ada languages. PL/pgSQL doesn't support PRAGMA, but plpgsql_check detects a function named plpgsql_check_pragma and gets options from the parameters of this function. These plpgsql_check options are valid until the end of the group of statements.
CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
BEGIN
...
-- for following statements disable check
PERFORM plpgsql_check_pragma('disable:check');
...
-- enable check again
PERFORM plpgsql_check_pragma('enable:check');
...
END;
$$ LANGUAGE plpgsql;
The function plpgsql_check_pragma is an immutable function that returns one. It is defined by the plpgsql_check extension. You can declare an alternative plpgsql_check_pragma function like:
CREATE OR REPLACE FUNCTION plpgsql_check_pragma(VARIADIC args text[])
RETURNS int AS $$
SELECT 1
$$ LANGUAGE sql IMMUTABLE;
Using the pragma function in the declaration part of the top block sets options at the function level too.
CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
DECLARE
aux int := plpgsql_check_pragma('disable:extra_warnings');
...
Shorter syntax for pragma is supported too:
CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
DECLARE r record;
BEGIN
PERFORM 'PRAGMA:TYPE:r (a int, b int)';
PERFORM 'PRAGMA:TABLE: x (like pg_class)';
...
echo:str - print string (for testing)
status:check, status:tracer, status:other_warnings, status:performance_warnings, status:extra_warnings, status:security_warnings
enable:check, enable:tracer, enable:other_warnings, enable:performance_warnings, enable:extra_warnings, enable:security_warnings
disable:check, disable:tracer, disable:other_warnings, disable:performance_warnings, disable:extra_warnings, disable:security_warnings
type:varname typename or type:varname (fieldname type, ...) - set type to variable of record type
table: name (column_name type, ...) or table: name (like tablename) - create ephemeral table
Pragmas enable:tracer and disable:tracer are active for Postgres 12 and higher.
Compilation
You need a development environment for PostgreSQL extensions:
make clean
make install
result:
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 clean
rm -f plpgsql_check.so libplpgsql_check.a libplpgsql_check.pc
rm -f plpgsql_check.o
rm -rf results/ regression.diffs regression.out tmp_check/ log/
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 all
clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -I. -I./ -I/usr/local/pgsql/include/server -I/usr/local/pgsql/include/internal -D_GNU_SOURCE -c -o plpgsql_check.o plpgsql_check.c
clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -shared -o plpgsql_check.so plpgsql_check.o -L/usr/local/pgsql/lib -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags
[pavel@localhost plpgsql_check]$ su root
Password: *******
[root@localhost plpgsql_check]# make USE_PGXS=1 install
/usr/bin/mkdir -p '/usr/local/pgsql/lib'
/usr/bin/mkdir -p '/usr/local/pgsql/share/extension'
/usr/bin/mkdir -p '/usr/local/pgsql/share/extension'
/usr/bin/install -c -m 755 plpgsql_check.so '/usr/local/pgsql/lib/plpgsql_check.so'
/usr/bin/install -c -m 644 plpgsql_check.control '/usr/local/pgsql/share/extension/'
/usr/bin/install -c -m 644 plpgsql_check--0.9.sql '/usr/local/pgsql/share/extension/'
[root@localhost plpgsql_check]# exit
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 installcheck
/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --psqldir='/usr/local/pgsql/bin' --dbname=pl_regression --load-language=plpgsql --dbname=contrib_regression plpgsql_check_passive plpgsql_check_active plpgsql_check_active-9.5
(using postmaster on Unix socket, default port)
============== dropping database "contrib_regression" ==============
DROP DATABASE
============== creating database "contrib_regression" ==============
CREATE DATABASE
ALTER DATABASE
============== installing plpgsql ==============
CREATE LANGUAGE
============== running regression test queries ==============
test plpgsql_check_passive ... ok
test plpgsql_check_active ... ok
test plpgsql_check_active-9.5 ... ok
=====================
All 3 tests passed.
=====================
Sometimes successful compilation can require the libicu-dev package (PostgreSQL 10 and higher, when PostgreSQL was compiled with ICU support):
sudo apt install libicu-dev
You can check the precompiled dll libraries at http://okbob.blogspot.cz/2015/02/plpgsqlcheck-is-available-for-microsoft.html
or compile it yourself:
Copy plpgsql_check.dll to PostgreSQL\14\lib
Copy plpgsql_check.control and plpgsql_check--2.1.sql to PostgreSQL\14\share\extension
Compilation against PostgreSQL 10 requires libICU!
Licence
Copyright (c) Pavel Stehule (pavel.stehule@gmail.com)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Note
If you like it, send a postcard to address
Pavel Stehule
Skalice 12
256 01 Benesov u Prahy
Czech Republic
I welcome any questions, comments, bug reports, and patches at the email address pavel.stehule@gmail.com
Author: okbob
Source Code: https://github.com/okbob/plpgsql_check
License: View license
1648803600
I founded this project because I wanted to publish the code I wrote over the last two years, when I tried to write enhanced checking for PostgreSQL upstream. It was not fully successful: integration into upstream requires some larger plpgsql refactoring, and it probably will not be done in the next few years (it is now Dec 2013). But the written code is fully functional and can be used in production (and it is used in production). So I created this extension to make it available to all plpgsql developers.
If you like it and would like to join the development of this extension, register with the postgresql extension hacking Google group.
Features
I welcome any ideas, patches, and bug reports.
plpgsql_check is the next generation of plpgsql_lint. It allows you to check source code by explicitly calling plpgsql_check_function.
PostgreSQL 10, 11, 12, 13 and 14 are supported.
The SQL statements inside PL/pgSQL functions are checked by the validator for semantic errors. These errors can be found by plpgsql_check_function:
Active mode
postgres=# CREATE EXTENSION plpgsql_check;
CREATE EXTENSION
postgres=# CREATE TABLE t1(a int, b int);
CREATE TABLE
postgres=#
CREATE OR REPLACE FUNCTION public.f1()
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE r record;
BEGIN
  FOR r IN SELECT * FROM t1
  LOOP
    RAISE NOTICE '%', r.c; -- there is bug - table t1 missing "c" column
  END LOOP;
END;
$function$;
CREATE FUNCTION
postgres=# select f1(); -- execution doesn't find a bug due to empty table t1
f1
────
(1 row)
postgres=# \x
Expanded display is on.
postgres=# select * from plpgsql_check_function_tb('f1()');
─[ RECORD 1 ]───────────────────────────
functionid │ f1
lineno │ 6
statement │ RAISE
sqlstate │ 42703
message │ record "r" has no field "c"
detail │ [null]
hint │ [null]
level │ error
position │ 0
query │ [null]
postgres=# \sf+ f1
CREATE OR REPLACE FUNCTION public.f1()
RETURNS void
LANGUAGE plpgsql
1 AS $function$
2 DECLARE r record;
3 BEGIN
4 FOR r IN SELECT * FROM t1
5 LOOP
6 RAISE NOTICE '%', r.c; -- there is bug - table t1 missing "c" column
7 END LOOP;
8 END;
9 $function$
Function plpgsql_check_function() has three possible formats: text, json or xml
select * from plpgsql_check_function('f1()', fatal_errors := false);
plpgsql_check_function
------------------------------------------------------------------------
error:42703:4:SQL statement:column "c" of relation "t1" does not exist
Query: update t1 set c = 30
-- ^
error:42P01:7:RAISE:missing FROM-clause entry for table "r"
Query: SELECT r.c
-- ^
error:42601:7:RAISE:too few parameters specified for RAISE
(7 rows)
postgres=# select * from plpgsql_check_function('fx()', format:='xml');
plpgsql_check_function
────────────────────────────────────────────────────────────────
<Function oid="16400"> ↵
<Issue> ↵
<Level>error</Level> ↵
<Sqlstate>42P01</Sqlstate> ↵
<Message>relation "foo111" does not exist</Message> ↵
<Stmt lineno="3">RETURN</Stmt> ↵
<Query position="23">SELECT (select a from foo111)</Query>↵
</Issue> ↵
</Function>
(1 row)
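The same check can also be requested in JSON, the third format named above. This is a minimal sketch, assuming the f1() function from the earlier example exists; the output is omitted because its exact layout depends on the installed plpgsql_check version.

```sql
-- Request the report for f1() as a JSON document instead of text or XML.
-- fatal_errors := false keeps checking after the first error is found.
SELECT *
  FROM plpgsql_check_function('f1()',
                              format := 'json',
                              fatal_errors := false);
```

Named arguments are used here so that the positional relid argument can be left at its default.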
You can set the level of warnings via the function's parameters:
The function to check is specified by signature or by oid, like 'fx()'::regprocedure or 16799::regprocedure. A possible alternative is using the name only, when the function's name is unique - like 'fx'. When the name is not unique or the function doesn't exist, an error is raised.
relid DEFAULT 0 - oid of the relation assigned to the trigger function. It is necessary for checking any trigger function.
fatal_errors boolean DEFAULT true - stop on first error
other_warnings boolean DEFAULT true - show warnings like a different number of attributes on the left and right side of an assignment, variable overlaps function's parameter, unused variables, unwanted casting, ..
extra_warnings boolean DEFAULT true - show warnings like missing RETURN, shadowed variables, dead code, never read (unused) function's parameter, unmodified variables, modified auto variables, ..
performance_warnings boolean DEFAULT false - performance related warnings like declared type with type modifier, casting, implicit casts in where clause (can be the reason why an index is not used), ..
security_warnings boolean DEFAULT false - security related checks like SQL injection vulnerability detection
anyelementtype regtype DEFAULT 'int' - a real type used instead of the anyelement type
anyenumtype regtype DEFAULT '-' - a real type used instead of the anyenum type
anyrangetype regtype DEFAULT 'int4range' - a real type used instead of the anyrange type
anycompatibletype DEFAULT 'int' - a real type used instead of the anycompatible type
anycompatiblerangetype DEFAULT 'int4range' - a real type used instead of the anycompatible range type
without_warnings DEFAULT false - disable all warnings
all_warnings DEFAULT false - enable all warnings
newtable DEFAULT NULL, oldtable DEFAULT NULL - the names of NEW or OLD transition tables. These parameters are required when transition tables are used.
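The parameters listed above can be combined with named-argument notation. The following is only a sketch, assuming a function f1() exists in the current database; the parameter names are taken from the list above.

```sql
-- Turn on the optional warning classes that default to false,
-- and report every issue rather than stopping at the first error.
SELECT *
  FROM plpgsql_check_function('f1()',
                              fatal_errors := false,
                              performance_warnings := true,
                              security_warnings := true);

-- Shortcut: enable every warning class at once.
SELECT * FROM plpgsql_check_function('f1()', all_warnings := true);
```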
When you want to check a trigger function, you have to specify the relation that will be used together with the trigger function:
CREATE TABLE bar(a int, b int);
postgres=# \sf+ foo_trg
CREATE OR REPLACE FUNCTION public.foo_trg()
RETURNS trigger
LANGUAGE plpgsql
1 AS $function$
2 BEGIN
3 NEW.c := NEW.a + NEW.b;
4 RETURN NEW;
5 END;
6 $function$
Missing relation specification
postgres=# select * from plpgsql_check_function('foo_trg()');
ERROR: missing trigger relation
HINT: Trigger relation oid must be valid
Correct trigger checking (with specified relation)
postgres=# select * from plpgsql_check_function('foo_trg()', 'bar');
plpgsql_check_function
--------------------------------------------------------
error:42703:3:assignment:record "new" has no field "c"
(1 row)
For triggers with transition tables you can set the oldtable or newtable parameters:
create or replace function footab_trig_func()
returns trigger as $$
declare x int;
begin
  if false then
    -- should be ok;
    select count(*) from newtab into x;
    -- should fail;
    select count(*) from newtab where d = 10 into x;
  end if;
  return null;
end;
$$ language plpgsql;
select * from plpgsql_check_function('footab_trig_func','footab', newtable := 'newtab');
You can use plpgsql_check_function to check functions and triggers in bulk. Please test the following queries:
-- check all nontrigger plpgsql functions
SELECT p.oid, p.proname, plpgsql_check_function(p.oid)
FROM pg_catalog.pg_namespace n
JOIN pg_catalog.pg_proc p ON pronamespace = n.oid
JOIN pg_catalog.pg_language l ON p.prolang = l.oid
WHERE l.lanname = 'plpgsql' AND p.prorettype <> 2279;
or
SELECT p.proname, tgrelid::regclass, cf.*
FROM pg_proc p
JOIN pg_trigger t ON t.tgfoid = p.oid
JOIN pg_language l ON p.prolang = l.oid
JOIN pg_namespace n ON p.pronamespace = n.oid,
LATERAL plpgsql_check_function(p.oid, t.tgrelid) cf
WHERE n.nspname = 'public' and l.lanname = 'plpgsql'
or
-- check all plpgsql functions (functions or trigger functions with defined triggers)
SELECT
    (pcf).functionid::regprocedure, (pcf).lineno, (pcf).statement,
    (pcf).sqlstate, (pcf).message, (pcf).detail, (pcf).hint, (pcf).level,
    (pcf)."position", (pcf).query, (pcf).context
FROM
(
    SELECT
        plpgsql_check_function_tb(pg_proc.oid, COALESCE(pg_trigger.tgrelid, 0)) AS pcf
    FROM pg_proc
    LEFT JOIN pg_trigger
        ON (pg_trigger.tgfoid = pg_proc.oid)
    WHERE
        prolang = (SELECT lang.oid FROM pg_language lang WHERE lang.lanname = 'plpgsql') AND
        pronamespace <> (SELECT nsp.oid FROM pg_namespace nsp WHERE nsp.nspname = 'pg_catalog') AND
        -- ignore unused triggers
        (pg_proc.prorettype <> (SELECT typ.oid FROM pg_type typ WHERE typ.typname = 'trigger') OR
         pg_trigger.tgfoid IS NOT NULL)
    OFFSET 0
) ss
ORDER BY (pcf).functionid::regprocedure::text, (pcf).lineno;
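For a quick overview, the tabular variant can be aggregated. The sketch below counts reported errors per non-trigger function; it assumes the column names shown in the plpgsql_check_function_tb output earlier (in particular level), and the aliases func and errors are illustrative only.

```sql
-- Count error-level issues per non-trigger plpgsql function.
-- fatal_errors := false is needed, otherwise each function stops at its first error.
SELECT p.oid::regprocedure AS func,
       count(*)            AS errors
  FROM pg_catalog.pg_proc p
  JOIN pg_catalog.pg_language l ON p.prolang = l.oid,
       LATERAL plpgsql_check_function_tb(p.oid, fatal_errors := false) cf
 WHERE l.lanname = 'plpgsql'
   AND p.prorettype <> 2279      -- 2279 is the oid of the trigger pseudo-type
   AND cf.level = 'error'
 GROUP BY p.oid
 ORDER BY errors DESC;
```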
Passive mode
In passive mode, functions are checked when they start - the plpgsql_check module must be loaded.
plpgsql_check.mode = [ disabled | by_function | fresh_start | every_start ]
plpgsql_check.fatal_errors = [ yes | no ]
plpgsql_check.show_nonperformance_warnings = false
plpgsql_check.show_performance_warnings = false
The default mode is by_function, which means that the enhanced check is done only in active mode - by plpgsql_check_function. fresh_start means cold start.
You can enable passive mode by:
load 'plpgsql'; -- 1.1 and higher doesn't need it
load 'plpgsql_check';
set plpgsql_check.mode = 'every_start';
SELECT fx(10); -- run functions - function is checked before runtime starts it
Limits
plpgsql_check should find almost all errors in really static code. When a developer uses some of PLpgSQL's dynamic features, like dynamic SQL or the record data type, false positives are possible. These should be rare in well-written code; when they occur, the affected function should be redesigned or plpgsql_check should be disabled for this function.
CREATE OR REPLACE FUNCTION f1()
RETURNS void AS $$
DECLARE r record;
BEGIN
  FOR r IN EXECUTE 'SELECT * FROM t1'
  LOOP
    RAISE NOTICE '%', r.c;
  END LOOP;
END;
$$ LANGUAGE plpgsql SET plpgsql.enable_check TO false;
Using plpgsql_check adds a small overhead (when passive mode is enabled), and you should use it only in development or preproduction environments.
This module doesn't check queries that are assembled at runtime. It is not possible to identify the results of dynamic queries, so plpgsql_check cannot set the correct type for record variables and cannot check dependent SQL statements and expressions.
When the type of a record variable is not known, you can assign it explicitly with pragma type:
DECLARE r record;
BEGIN
EXECUTE format('SELECT * FROM %I', _tablename) INTO r;
PERFORM plpgsql_check_pragma('type: r (id int, processed bool)');
IF NOT r.processed THEN
...
Attention: The SQL injection check can detect only some SQL injection vulnerabilities. This tool cannot be used for a security audit! Some issues may not be detected. This check can raise false alarms too - probably when a variable is sanitized by another command or when the value is of some composite type.
plpgsql_check cannot detect the structure of referenced cursors. A reference to a cursor in PLpgSQL is implemented as the name of a global cursor. At check time, the name is not known (not in all cases), and the global cursor doesn't exist. This is a significant obstacle for any static analysis. plpgsql_check cannot set the correct type for record variables and cannot check dependent SQL statements and expressions. The solution is the same as for dynamic SQL: don't use a record variable as the target when you use the refcursor type, or disable plpgsql_check for these functions.
CREATE OR REPLACE FUNCTION foo(refcur_var refcursor)
RETURNS void AS $$
DECLARE
  rec_var record;
BEGIN
  FETCH refcur_var INTO rec_var; -- this is STOP for plpgsql_check
  RAISE NOTICE '%', rec_var;     -- record rec_var is not assigned yet error
END;
$$ LANGUAGE plpgsql;
In this case a record type should not be used (use known rowtype instead):
CREATE OR REPLACE FUNCTION foo(refcur_var refcursor)
RETURNS void AS $$
DECLARE
  rec_var some_rowtype;
BEGIN
  FETCH refcur_var INTO rec_var;
  RAISE NOTICE '%', rec_var;
END;
$$ LANGUAGE plpgsql;
plpgsql_check cannot verify queries over temporary tables that are created at plpgsql function runtime. For this use case it is necessary to create a fake temp table or disable plpgsql_check for this function.
In reality, temp tables are stored in their own (per user) schema with higher priority than persistent tables. So you can safely use the following trick:
CREATE OR REPLACE FUNCTION public.disable_dml()
RETURNS trigger
LANGUAGE plpgsql AS $function$
BEGIN
  RAISE EXCEPTION SQLSTATE '42P01'
     USING message = format('this instance of %I table doesn''t allow any DML operation', TG_TABLE_NAME),
           hint = format('you should to run "CREATE TEMP TABLE %1$I(LIKE %1$I INCLUDING ALL);" statement',
                         TG_TABLE_NAME);
  RETURN NULL;
END;
$function$;
CREATE TABLE foo(a int, b int); -- doesn't hold data ever
CREATE TRIGGER foo_disable_dml
   BEFORE INSERT OR UPDATE OR DELETE ON foo
   EXECUTE PROCEDURE disable_dml();
postgres=# INSERT INTO foo VALUES(10,20);
ERROR: this instance of foo table doesn't allow any DML operation
HINT: you should to run "CREATE TEMP TABLE foo(LIKE foo INCLUDING ALL);" statement
postgres=# CREATE TEMP TABLE foo(LIKE foo INCLUDING ALL);
CREATE TABLE
postgres=# INSERT INTO foo VALUES(10,20);
INSERT 0 1
This trick partially emulates GLOBAL TEMP tables and allows static validation. Another possibility is using the template foreign data wrapper (https://github.com/okbob/template_fdw).
You can use pragma table and create an ephemeral table:
BEGIN
CREATE TEMP TABLE xxx(a int);
PERFORM plpgsql_check_pragma('table: xxx(a int)');
INSERT INTO xxx VALUES(10);
Dependency list
The function plpgsql_show_dependency_tb shows all functions, operators and relations used inside the processed function:
postgres=# select * from plpgsql_show_dependency_tb('testfunc(int,float)');
┌──────────┬───────┬────────┬─────────┬────────────────────────────┐
│ type │ oid │ schema │ name │ params │
╞══════════╪═══════╪════════╪═════════╪════════════════════════════╡
│ FUNCTION │ 36008 │ public │ myfunc1 │ (integer,double precision) │
│ FUNCTION │ 35999 │ public │ myfunc2 │ (integer,double precision) │
│ OPERATOR │ 36007 │ public │ ** │ (integer,integer) │
│ RELATION │ 36005 │ public │ myview │ │
│ RELATION │ 36002 │ public │ mytable │ │
└──────────┴───────┴────────┴─────────┴────────────────────────────┘
(5 rows)
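Because the result is an ordinary table function, it can be filtered like any other relation. A small sketch, assuming the column names shown in the output above and the same testfunc(int,float) function; the column names are quoted since they coincide with SQL keywords.

```sql
-- List only the tables and views that testfunc depends on.
SELECT "schema", "name"
  FROM plpgsql_show_dependency_tb('testfunc(int,float)')
 WHERE "type" = 'RELATION'
 ORDER BY "schema", "name";
```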
Profiler
plpgsql_check contains a simple profiler of plpgsql functions and procedures. It can work with or without access to shared memory, depending on the shared_preload_libraries config. When plpgsql_check is initialized by shared_preload_libraries, it can allocate shared memory, and function profiles are stored there. When plpgsql_check cannot allocate shared memory, the profile is stored in session memory.
Due to dependencies, shared_preload_libraries should contain plpgsql first:
postgres=# show shared_preload_libraries ;
┌──────────────────────────┐
│ shared_preload_libraries │
╞══════════════════════════╡
│ plpgsql,plpgsql_check │
└──────────────────────────┘
(1 row)
The profiler is active when the GUC plpgsql_check.profiler is on. The profiler doesn't require shared memory, but if there is no shared memory, then the profile is limited to the active session.
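A session-level profiling run might look like the sketch below. It assumes a function fx(int) exists (as in the profiler examples in this document) and that the library is not already preloaded, in which case the LOAD is unnecessary.

```sql
LOAD 'plpgsql_check';              -- not needed when the library is preloaded
SET plpgsql_check.profiler = on;   -- start collecting profile data in this session
SELECT fx(10);                     -- run the function to be profiled
SELECT lineno, avg_time, source    -- read back the per-line profile
  FROM plpgsql_profiler_function_tb('fx(int)');
```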
When plpgsql_check is initialized by shared_preload_libraries, another GUC is available to configure the amount of shared memory used by the profiler: plpgsql_check.profiler_max_shared_chunks. This defines the maximum number of statement chunks that can be stored in shared memory. For each plpgsql function (or procedure), the whole content is split into chunks of 30 statements. If needed, multiple chunks can be used to store the whole content of a single function. A single chunk is 1704 bytes. The default value for this GUC is 15000, which should be enough for big projects containing hundreds of thousands of statements in plpgsql, and will consume about 24MB of memory. If your project doesn't require that many chunks, you can set this parameter to a smaller number in order to decrease the memory usage. The minimum value is 50 (which should consume about 83kB of memory), and the maximum value is 100000 (which should consume about 163MB of memory). Changing this parameter requires a PostgreSQL restart.
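The memory figures quoted above follow from the chunk size. The query below is only a worked check of that arithmetic (chunks × 1704 bytes, and chunks × 30 statements as an upper bound when every chunk is full); it does not read any plpgsql_check setting, and the column aliases are illustrative.

```sql
-- Reproduce the sizing figures for the minimum, default and maximum setting.
-- The cast to bigint avoids an ambiguous call to pg_size_pretty.
SELECT chunks,
       chunks * 30                           AS max_statements,
       pg_size_pretty(chunks::bigint * 1704) AS shared_memory
  FROM (VALUES (50), (15000), (100000)) AS v(chunks);
```

The shared_memory column should agree with the approximate values given in the text (about 83kB, 24MB and 163MB).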
The profiler will also retrieve the query identifier for each instruction that contains an expression or optimizable statement. Note that this requires pg_stat_statements (or another similar third-party extension) to be installed. There are some limitations to the query identifier retrieval:
Attention: An update of shared profiles can decrease performance on servers under higher load.
The profile can be displayed by the function plpgsql_profiler_function_tb:
postgres=# select lineno, avg_time, source from plpgsql_profiler_function_tb('fx(int)');
┌────────┬──────────┬───────────────────────────────────────────────────────────────────┐
│ lineno │ avg_time │ source │
╞════════╪══════════╪═══════════════════════════════════════════════════════════════════╡
│ 1 │ │ │
│ 2 │ │ declare result int = 0; │
│ 3 │ 0.075 │ begin │
│ 4 │ 0.202 │ for i in 1..$1 loop │
│ 5 │ 0.005 │ select result + i into result; select result + i into result; │
│ 6 │ │ end loop; │
│ 7 │ 0 │ return result; │
│ 8 │ │ end; │
└────────┴──────────┴───────────────────────────────────────────────────────────────────┘
(9 rows)
The profile per statement (not per line) can be displayed by the function plpgsql_profiler_function_statements_tb:
CREATE OR REPLACE FUNCTION public.fx1(a integer)
RETURNS integer
LANGUAGE plpgsql
1 AS $function$
2 begin
3 if a > 10 then
4 raise notice 'ahoj';
5 return -1;
6 else
7 raise notice 'nazdar';
8 return 1;
9 end if;
10 end;
11 $function$
postgres=# select stmtid, parent_stmtid, parent_note, lineno, exec_stmts, stmtname
from plpgsql_profiler_function_statements_tb('fx1');
┌────────┬───────────────┬─────────────┬────────┬────────────┬─────────────────┐
│ stmtid │ parent_stmtid │ parent_note │ lineno │ exec_stmts │ stmtname │
╞════════╪═══════════════╪═════════════╪════════╪════════════╪═════════════════╡
│ 0 │ ∅ │ ∅ │ 2 │ 0 │ statement block │
│ 1 │ 0 │ body │ 3 │ 0 │ IF │
│ 2 │ 1 │ then body │ 4 │ 0 │ RAISE │
│ 3 │ 1 │ then body │ 5 │ 0 │ RETURN │
│ 4 │ 1 │ else body │ 7 │ 0 │ RAISE │
│ 5 │ 1 │ else body │ 8 │ 0 │ RETURN │
└────────┴───────────────┴─────────────┴────────┴────────────┴─────────────────┘
(6 rows)
All stored profiles can be displayed by calling the function plpgsql_profiler_functions_all:
postgres=# select * from plpgsql_profiler_functions_all();
┌───────────────────────┬────────────┬────────────┬──────────┬─────────────┬──────────┬──────────┐
│ funcoid │ exec_count │ total_time │ avg_time │ stddev_time │ min_time │ max_time │
╞═══════════════════════╪════════════╪════════════╪══════════╪═════════════╪══════════╪══════════╡
│ fxx(double precision) │ 1 │ 0.01 │ 0.01 │ 0.00 │ 0.01 │ 0.01 │
└───────────────────────┴────────────┴────────────┴──────────┴─────────────┴──────────┴──────────┘
(1 row)
There are two functions for cleaning stored profiles: plpgsql_profiler_reset_all() and plpgsql_profiler_reset(regprocedure).
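As a brief illustration of the two reset functions, assuming a profiled function fx(int) as in the earlier examples:

```sql
-- Drop the stored profile of a single function ...
SELECT plpgsql_profiler_reset('fx(int)');
-- ... or drop all stored profiles.
SELECT plpgsql_profiler_reset_all();
```

Resetting before a measurement run keeps earlier executions from being mixed into the new averages.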
plpgsql_check provides two coverage functions:
plpgsql_coverage_statements(name)
plpgsql_coverage_branches(name)
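A possible way to call them is sketched below. It assumes that the profiler is enabled and has already collected data for the function fx1 from the earlier example; the column aliases are illustrative, and the exact form of the returned values is not stated in this document.

```sql
-- Coverage is derived from profiler data, so run the function first.
SET plpgsql_check.profiler = on;
SELECT fx1(5);
SELECT plpgsql_coverage_statements('fx1') AS statement_coverage,
       plpgsql_coverage_branches('fx1')   AS branch_coverage;
```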
There is another very good PLpgSQL profiler - https://bitbucket.org/openscg/plprofiler
My extension is designed to be simple to use and practical. Nothing more or less.
plprofiler is more complex. It builds call graphs, and from these graphs it can create flame graphs of execution times.
Both extensions can be used together with PostgreSQL's built-in feature - tracking functions.
set track_functions to 'pl';
...
select * from pg_stat_user_functions;
Tracer
plpgsql_check provides a tracing facility. In this mode you can see notices on the start or end of functions (terse and default verbosity) and the start or end of statements (verbose verbosity). For default and verbose verbosity the content of function arguments is displayed. The content of related variables is displayed when verbosity is verbose.
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #0 ->> start of inline_code_block (Oid=0)
NOTICE: #2 ->> start of function fx(integer,integer,date,text) (Oid=16405)
NOTICE: #2 call by inline_code_block line 1 at PERFORM
NOTICE: #2 "a" => '10', "b" => null, "c" => '2020-08-03', "d" => 'stěhule'
NOTICE: #4 ->> start of function fx(integer) (Oid=16404)
NOTICE: #4 call by fx(integer,integer,date,text) line 1 at PERFORM
NOTICE: #4 "a" => '10'
NOTICE: #4 <<- end of function fx (elapsed time=0.098 ms)
NOTICE: #2 <<- end of function fx (elapsed time=0.399 ms)
NOTICE: #0 <<- end of block (elapsed time=0.754 ms)
The number after # is an execution frame counter (this number is related to the depth of the error context stack). It allows pairing the start and end of a function.
Tracing is enabled by setting plpgsql_check.tracer to on. Attention - enabling this behaviour has a significant negative impact on performance (unlike the profiler). You can set the level of the output used by the tracer with plpgsql_check.tracer_errlevel (default is notice). The output content is limited by the length specified by the plpgsql_check.tracer_variable_max_length configuration variable.
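Put together, a tracing session might be configured as in the sketch below. It assumes a superuser session (plpgsql_check.enable_tracer is described in this document as requiring superuser rights), a function fx(int) to trace, and an arbitrary illustrative value of 100 for the variable length limit.

```sql
LOAD 'plpgsql_check';                                -- not needed when preloaded
SET plpgsql_check.enable_tracer = on;                -- superuser only
SET plpgsql_check.tracer = on;                       -- start emitting trace notices
SET plpgsql_check.tracer_errlevel = notice;          -- the default level
SET plpgsql_check.tracer_variable_max_length = 100;  -- truncate long values
SELECT fx(10);                                       -- traced call
```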
In terse verbosity mode the output is reduced:
postgres=# set plpgsql_check.tracer_verbosity TO terse;
SET
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #0 start of inline code block (oid=0)
NOTICE: #2 start of fx (oid=16405)
NOTICE: #4 start of fx (oid=16404)
NOTICE: #4 end of fx
NOTICE: #2 end of fx
NOTICE: #0 end of inline code block
In verbose mode the output is extended with statement details:
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #0 ->> start of block inline_code_block (oid=0)
NOTICE: #0.1 1 --> start of PERFORM
NOTICE: #2 ->> start of function fx(integer,integer,date,text) (oid=16405)
NOTICE: #2 call by inline_code_block line 1 at PERFORM
NOTICE: #2 "a" => '10', "b" => null, "c" => '2020-08-04', "d" => 'stěhule'
NOTICE: #2.1 1 --> start of PERFORM
NOTICE: #2.1 "a" => '10'
NOTICE: #4 ->> start of function fx(integer) (oid=16404)
NOTICE: #4 call by fx(integer,integer,date,text) line 1 at PERFORM
NOTICE: #4 "a" => '10'
NOTICE: #4.1 6 --> start of assignment
NOTICE: #4.1 "a" => '10', "b" => '20'
NOTICE: #4.1 <-- end of assignment (elapsed time=0.076 ms)
NOTICE: #4.1 "res" => '130'
NOTICE: #4.2 7 --> start of RETURN
NOTICE: #4.2 "res" => '130'
NOTICE: #4.2 <-- end of RETURN (elapsed time=0.054 ms)
NOTICE: #4 <<- end of function fx (elapsed time=0.373 ms)
NOTICE: #2.1 <-- end of PERFORM (elapsed time=0.589 ms)
NOTICE: #2 <<- end of function fx (elapsed time=0.727 ms)
NOTICE: #0.1 <-- end of PERFORM (elapsed time=1.147 ms)
NOTICE: #0 <<- end of block (elapsed time=1.286 ms)
A special feature of the tracer is tracing of the ASSERT statement when plpgsql_check.trace_assert is on. When plpgsql_check.trace_assert_verbosity is DEFAULT, then all of the function's or procedure's variables are displayed when the assert expression is false. When this configuration is VERBOSE, then all variables from all plpgsql frames are displayed. This behaviour is independent of the plpgsql.check_asserts value. It can be used even when assertions are disabled in the plpgsql runtime.
postgres=# set plpgsql_check.tracer to off;
postgres=# set plpgsql_check.trace_assert_verbosity TO verbose;
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #4 PLpgSQL assert expression (false) on line 12 of fx(integer) is false
NOTICE: "a" => '10', "res" => null, "b" => '20'
NOTICE: #2 PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
NOTICE: "a" => '10', "b" => null, "c" => '2020-08-05', "d" => 'stěhule'
NOTICE: #0 PL/pgSQL function inline_code_block line 1 at PERFORM
ERROR: assertion failed
CONTEXT: PL/pgSQL function fx(integer) line 12 at ASSERT
SQL statement "SELECT fx(a)"
PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
SQL statement "SELECT fx(10,null, 'now', e'stěhule')"
PL/pgSQL function inline_code_block line 1 at PERFORM
postgres=# set plpgsql.check_asserts to off;
SET
postgres=# do $$ begin perform fx(10,null, 'now', e'stěhule'); end; $$;
NOTICE: #4 PLpgSQL assert expression (false) on line 12 of fx(integer) is false
NOTICE: "a" => '10', "res" => null, "b" => '20'
NOTICE: #2 PL/pgSQL function fx(integer,integer,date,text) line 1 at PERFORM
NOTICE: "a" => '10', "b" => null, "c" => '2020-08-05', "d" => 'stěhule'
NOTICE: #0 PL/pgSQL function inline_code_block line 1 at PERFORM
DO
The tracer prints the content of variables or function arguments. For security definer functions, this content can hold security-sensitive data. This is the reason why the tracer is disabled by default and should be enabled only with superuser rights via plpgsql_check.enable_tracer.
Pragma
You can configure plpgsql_check behaviour inside a checked function with the "pragma" function. This is an analogy of the PRAGMA feature of the PL/SQL or Ada languages. PLpgSQL doesn't support PRAGMA, but plpgsql_check detects a function named plpgsql_check_pragma and gets options from the parameters of this function. These plpgsql_check options are valid to the end of the group of statements.
CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
BEGIN
  ...
  -- for following statements disable check
  PERFORM plpgsql_check_pragma('disable:check');
  ...
  -- enable check again
  PERFORM plpgsql_check_pragma('enable:check');
  ...
END;
$$ LANGUAGE plpgsql;
The function plpgsql_check_pragma is an immutable function that returns one. It is defined by the plpgsql_check extension. You can declare an alternative plpgsql_check_pragma function like:
CREATE OR REPLACE FUNCTION plpgsql_check_pragma(VARIADIC args text[])
RETURNS int AS $$
SELECT 1
$$ LANGUAGE sql IMMUTABLE;
Using the pragma function in the declaration part of the top block sets options at function level too.
CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
DECLARE
aux int := plpgsql_check_pragma('disable:extra_warnings');
...
Shorter syntax for pragma is supported too:
CREATE OR REPLACE FUNCTION test()
RETURNS void AS $$
DECLARE r record;
BEGIN
PERFORM 'PRAGMA:TYPE:r (a int, b int)';
PERFORM 'PRAGMA:TABLE: x (like pg_class)';
...
echo:str - print string (for testing)
status:check, status:tracer, status:other_warnings, status:performance_warnings, status:extra_warnings, status:security_warnings
enable:check, enable:tracer, enable:other_warnings, enable:performance_warnings, enable:extra_warnings, enable:security_warnings
disable:check, disable:tracer, disable:other_warnings, disable:performance_warnings, disable:extra_warnings, disable:security_warnings
type:varname typename or type:varname (fieldname type, ...) - set type to variable of record type
table: name (column_name type, ...) or table: name (like tablename) - create ephemeral table
Pragmas enable:tracer and disable:tracer are active for Postgres 12 and higher.
Compilation
You need a development environment for PostgreSQL extensions:
make clean
make install
result:
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 clean
rm -f plpgsql_check.so libplpgsql_check.a libplpgsql_check.pc
rm -f plpgsql_check.o
rm -rf results/ regression.diffs regression.out tmp_check/ log/
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 all
clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -I. -I./ -I/usr/local/pgsql/include/server -I/usr/local/pgsql/include/internal -D_GNU_SOURCE -c -o plpgsql_check.o plpgsql_check.c
clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -shared -o plpgsql_check.so plpgsql_check.o -L/usr/local/pgsql/lib -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags
[pavel@localhost plpgsql_check]$ su root
Password: *******
[root@localhost plpgsql_check]# make USE_PGXS=1 install
/usr/bin/mkdir -p '/usr/local/pgsql/lib'
/usr/bin/mkdir -p '/usr/local/pgsql/share/extension'
/usr/bin/mkdir -p '/usr/local/pgsql/share/extension'
/usr/bin/install -c -m 755 plpgsql_check.so '/usr/local/pgsql/lib/plpgsql_check.so'
/usr/bin/install -c -m 644 plpgsql_check.control '/usr/local/pgsql/share/extension/'
/usr/bin/install -c -m 644 plpgsql_check--0.9.sql '/usr/local/pgsql/share/extension/'
[root@localhost plpgsql_check]# exit
[pavel@localhost plpgsql_check]$ make USE_PGXS=1 installcheck
/usr/local/pgsql/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --psqldir='/usr/local/pgsql/bin' --dbname=pl_regression --load-language=plpgsql --dbname=contrib_regression plpgsql_check_passive plpgsql_check_active plpgsql_check_active-9.5
(using postmaster on Unix socket, default port)
============== dropping database "contrib_regression" ==============
DROP DATABASE
============== creating database "contrib_regression" ==============
CREATE DATABASE
ALTER DATABASE
============== installing plpgsql ==============
CREATE LANGUAGE
============== running regression test queries ==============
test plpgsql_check_passive ... ok
test plpgsql_check_active ... ok
test plpgsql_check_active-9.5 ... ok
=====================
All 3 tests passed.
=====================
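After a successful install, the extension can be registered and checked from psql. This is a small sketch using only standard PostgreSQL catalog access; the version reported depends on the build that was installed.

```sql
-- Register the freshly installed extension in the current database
-- and confirm which version PostgreSQL picked up.
CREATE EXTENSION IF NOT EXISTS plpgsql_check;
SELECT extname, extversion
  FROM pg_extension
 WHERE extname = 'plpgsql_check';
```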
Sometimes a successful compilation requires the libicu-dev package (PostgreSQL 10 and higher - when PostgreSQL was compiled with ICU support):
sudo apt install libicu-dev
You can use the precompiled dll libraries (http://okbob.blogspot.cz/2015/02/plpgsqlcheck-is-available-for-microsoft.html) or compile them yourself and then copy the files:
plpgsql_check.dll to PostgreSQL\14\lib
plpgsql_check.control and plpgsql_check--2.1.sql to PostgreSQL\14\share\extension
Compilation against PostgreSQL 10 requires libICU!
Licence
Copyright (c) Pavel Stehule (pavel.stehule@gmail.com)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Note
If you like it, send a postcard to this address:
Pavel Stehule
Skalice 12
256 01 Benesov u Prahy
Czech Republic
I welcome any questions, comments, bug reports, and patches at the e-mail address pavel.stehule@gmail.com
Author: okbob
Source Code: https://github.com/okbob/plpgsql_check
License: View license