Reach and Limits of the Supermassive Model GPT-3

About this blog post

This blog post provides an explanation of GPT-3 [1]. A summary of the content is as follows.

  • In GPT-2, the predecessor of GPT-3, the authors built a language model with a huge dataset and a huge network, and got good results without fine-tuning it on each task.
  • In GPT-3, the authors built a language model with an even bigger dataset and an even bigger network, and got great results when the model sees a few dozen examples (few-shot).
  • On the other hand, the limitations of simply scaling up a language model are becoming apparent across various tasks.
  • There are also issues of bias around race, gender, and religion, as well as concerns about deliberate misuse.

This article is organized as follows.

1. Description of the Transformer, GPT-2

2. Concept and Technical Description of GPT-3

3. Tasks that work well using GPT-3

4. Tasks that do not work well using GPT-3

5. Views on bias and misuse


Transformer

Before we get to GPT-2, the predecessor of GPT-3, let’s take a look at the Transformer Encoder [2] that is used in it. The paper, with a title provocative to anyone who loved LSTMs/CNNs (“Attention Is All You Need”), made quite a splash.

The proposed mechanism, dot-product attention, is neither a CNN nor an LSTM; stacking these attention blocks yields a model (the Transformer) that far outperforms existing methods.

[Figure] Overview of the Transformer model, taken from [2].

There are three variables in the (dot-product) attention used in the Transformer: Query, Key, and Value. Simply put, the mechanism calculates an attention weight between the Query and each Key word, and then takes a sum of the Values weighted by those attention weights.

[Figure] Scaled dot-product attention.
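As a minimal sketch of this computation in NumPy (array shapes and names are my own, not from the paper):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v) -> (n_q, d_v)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])         # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax -> attention weights
    return weights @ V                              # weighted sum of the values

# toy example: 2 queries attend over 3 key/value pairs
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 4)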

Multi-Head Attention, which uses multiple attention heads in parallel (in MLP terms, widening a layer rather than deepening it), is defined as follows. The (single-head) attention in the figure above uses Q and K as they are, but in Multi-Head Attention each head has its own projection matrices W_i^Q, W_i^K, and W_i^V, and the features projected by these matrices are used to compute that head’s attention.
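Continuing the sketch above, a NumPy version of Multi-Head Attention (it reuses scaled_dot_product_attention from the previous snippet; the shapes and names are illustrative, not from the paper):

import numpy as np

def multi_head_attention(X, heads, W_o):
    """heads: list of per-head (W_q, W_k, W_v) projection matrices,
    each (d_model, d_head); W_o: (n_heads * d_head, d_model)."""
    outs = [scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
            for (W_q, W_k, W_v) in heads]        # one attention per head
    return np.concatenate(outs, axis=-1) @ W_o   # concatenate heads, project back

rng = np.random.default_rng(1)
n, d_model, n_heads, d_head = 5, 16, 4, 4
X = rng.normal(size=(n, d_model))                # self-attention: Q, K, V all from X
heads = [tuple(rng.normal(size=(d_model, d_head)) for _ in range(3))
         for _ in range(n_heads)]
W_o = rng.normal(size=(n_heads * d_head, d_model))
print(multi_head_attention(X, heads, W_o).shape)  # (5, 16)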

When the Q, K, and V used in this dot-product attention are all derived from the same data, it is called Self-Attention. In the Transformer, the Encoder and the lower part of the Decoder use Self-Attention. The upper attention block of the Decoder is not Self-Attention, because the Query comes from the Decoder while the K and V come from the Encoder. The following figure shows an example of attention weights: the word “making” is used as a query, and the attention weight for each key word is calculated and visualized. Each attention head learns different dependencies, so the key words are colored in multiple ways to represent the attention weight of each head.

[Figure] Attention weights for the query word “making”; each head’s weights are shown in a different color.


GPT-2

GPT-2 is research on autoregressive language models using a huge dataset and a highly expressive model, where the language model is used as-is to solve various tasks (zero-shot). It consists of three elements: “zero-shot by autoregressive models”, “large scale models”, and “a large dataset”.


Autoregressive language models and zero-shot

An autoregressive language model is a model that defines the probability of the next word conditioned on the words that have appeared before it.

[Figure] Conceptual diagram of an autoregressive language model using an RNN. The probability of the nth token (word) s_n in sequence x is defined by the conditional probability given s_1 … s_(n-1).
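In equation form, the model factorizes the probability of a sequence x = (s_1, …, s_N) into a product of next-token conditionals:

p(x) = \prod_{n=1}^{N} p(s_n \mid s_1, \ldots, s_{n-1})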

The “zero-shot” method used in GPT-2 is to use the language model’s next-word predictions as the answer to a task. For example, to translate the word “cheese” from English to French as shown in the figure below, you set s_1 … s_(n-1) to “Translate English to French: cheese =>” and take the predicted s_n, “fromage” (French for cheese), as the output.

[Figure] Translation using zero-shot.
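A sketch of the idea in Python; note that lm_next_words below is a hypothetical stand-in for any autoregressive language model’s next-word prediction, not an actual GPT-2 API:

def zero_shot_translate(lm_next_words, word):
    # The task is specified purely as a text prefix; no fine-tuning happens.
    prompt = f"Translate English to French: {word} =>"
    # The model's most likely continuation is read off as the answer.
    return lm_next_words(prompt)

# e.g. zero_shot_translate(my_lm, "cheese") should ideally return "fromage"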

Highly expressive large-scale models

In GPT-2, instead of an RNN, the authors use the aforementioned Transformer Encoder block (Multi-Head Attention + Feed-Forward = Transformer Encoder), which is highly expressive, to construct the autoregressive language model. However, instead of using it as-is, they slightly changed the initialization and the network structure. The structure is changed from the original Post-LN block (left in the figure below) to a Pre-LN block (right).

[Figure] Quoted from [4]: the Pre-LN Transformer structure used in GPT-2.

The benefits of this structure are discussed in detail in the paper “On Layer Normalization in the Transformer Architecture” [4]. With the original structure, the Transformer must use a technique called “warm-up”, which keeps the learning rate small for the early stages of training (from a few hundred to a few thousand steps), in order to learn well (see the left-hand side of the figure below). The Pre-LN Transformer structure used in GPT-2 (the name “Pre-LN” does not appear in the GPT-2 paper, which was published earlier) bounds the size of the gradient by the depth of the layers (L), which makes it possible to train the Transformer without warm-up and also increases accuracy (see the figure below, right). Incidentally, Rectified Adam (RAdam), a variant of Adam that does not need warm-up, also lets you train without it.

[Figure] Quoted from [4]. (Left) Accuracy with and without warm-up for Post-LN, the original structure. (Right) Pre-LN, the structure adopted by GPT-2, trained without warm-up.
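The two structures differ only in where LayerNorm sits relative to the residual connection. A minimal PyTorch-flavored sketch (the sublayers are stand-ins, not a full Transformer implementation):

import torch
import torch.nn as nn

class PostLNBlock(nn.Module):
    """Original Transformer ordering: LayerNorm after the residual add."""
    def __init__(self, attn, ffn, d_model):
        super().__init__()
        self.attn, self.ffn = attn, ffn
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):
        x = self.ln1(x + self.attn(x))
        return self.ln2(x + self.ffn(x))

class PreLNBlock(nn.Module):
    """GPT-2-style ordering: LayerNorm before each sublayer,
    which [4] shows can be trained without warm-up."""
    def __init__(self, attn, ffn, d_model):
        super().__init__()
        self.attn, self.ffn = attn, ffn
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):
        x = x + self.attn(self.ln1(x))
        return x + self.ffn(self.ln2(x))

# toy check with linear stand-ins for the attention and feed-forward sublayers
d = 8
block = PreLNBlock(nn.Linear(d, d), nn.Linear(d, d), d)
print(block(torch.randn(4, d)).shape)  # torch.Size([4, 8])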

GPT-2 proposes four models of different sizes, all using the modified Transformer Encoder mentioned above, with up to 48 layers. Considering that the original Transformer has only 12 layers (6 Encoder layers and 6 Decoder layers), you can see that this is a very large model. In terms of parameter count, the largest model has 1.5 billion parameters, about 25 times more than ResNet152 (about 60 million parameters), which is often used for image tasks.

[Figure] Quoted from [3]: the four model sizes.

A large dataset

The authors use a large dataset called WebText for GPT-2 training. As the name implies, it is a set of text data fetched from the web: 40 GB in total, containing 8 million documents.

As an aside, there is a similar trend on image tasks of achieving high accuracy with large datasets and large models; a famous example is BiT [7]. In that study, a huge model pre-trained on a huge dataset showed high performance under fine-tuning. The model has 4 times as many channels as ResNet152 and was trained for several months on JFT-300M, a dataset of 300 million images. The results show that fine-tuning then works with a small amount of data, and optimization is fast and efficient.

[Figure] Quoted from [7].

GPT-2 Results

The table below shows the results of GPT-2, which combines the three components “zero-shot by autoregressive models”, “large scale models”, and “a large dataset”. It set new SOTA on various datasets. Note that GPT-2 used neither fine-tuning nor even few-shot learning on each dataset, showing that, as the title of the GPT-2 paper suggests, “Language Models are Unsupervised Multitask Learners”.


A Wrapper for Sembast and SQFlite to Enable Easy

FHIR_DB

This is really just a wrapper around Sembast_SQFLite, so all of the heavy lifting was done by Alex Tekartik. If you have any questions about working with this package, I highly recommend you take a look at Sembast. He's also just a super nice guy, and even answered a question for me when I was deciding which sembast version to use. As usual, ResoCoder also has a good tutorial.

I have an interest in low-resource settings and thus a specific reason to be able to store data offline. To encourage this use, there are a number of other packages I have created based around the data format FHIR. FHIR® is the registered trademark of HL7 and is used with the permission of HL7. Use of the FHIR trademark does not constitute endorsement of this product by HL7.

Using the Db

So, while not absolutely necessary, I highly recommend that you use some sort of interface class. This adds the benefit of more easily handling errors, plus if you change to a different database in the future, you don't have to change the rest of your app, just the interface.

I've used something like this in my projects:

// Assumes an Either type with left/right helpers (e.g. from package:dartz)
// and a DbFailure failure union defined elsewhere in your app.
class IFhirDb {
  IFhirDb();
  final ResourceDao resourceDao = ResourceDao();

  Future<Either<DbFailure, Resource>> save(Resource resource) async {
    Resource resultResource;
    try {
      resultResource = await resourceDao.save(resource);
    } catch (error) {
      return left(DbFailure.unableToSave(error: error.toString()));
    }
    return right(resultResource);
  }

  Future<Either<DbFailure, List<Resource>>> returnListOfSingleResourceType(
      String resourceType) async {
    List<Resource> resultList;
    try {
      resultList =
          await resourceDao.getAllSortedById(resourceType: resourceType);
    } catch (error) {
      return left(DbFailure.unableToObtainList(error: error.toString()));
    }
    return right(resultList);
  }

  Future<Either<DbFailure, List<Resource>>> searchFunction(
      String resourceType, String searchString, String reference) async {
    List<Resource> resultList;
    try {
      resultList =
          await resourceDao.searchFor(resourceType, searchString, reference);
    } catch (error) {
      return left(DbFailure.unableToObtainList(error: error.toString()));
    }
    return right(resultList);
  }
}

I like this because if there's an I/O error or something, it won't crash your app. Then, you can call this interface in your app like the following:

final patient = Patient(
    resourceType: 'Patient',
    name: [HumanName(text: 'New Patient Name')],
    birthDate: Date(DateTime.now()),
);

final saveResult = await IFhirDb().save(patient);

This will save your newly created patient to the locally embedded database.

IMPORTANT: this database expects that all previously created resources have an id. When you save a resource, it checks whether that resource type has already been stored (each resource type is saved in its own store in the database). It then checks whether the resource has an ID. If there's no ID, it creates one for that resource (along with metadata for the version number and creation time), saves it, and returns the resource. If it already has an ID, it copies the old version of the resource into a _history store, then updates the metadata of the new resource and saves that version into the appropriate store for that resource type. If, for instance, we have a previously created patient:

{
    "resourceType": "Patient",
    "id": "fhirfli-294057507-6811107",
    "meta": {
        "versionId": "1",
        "lastUpdated": "2020-10-16T19:41:28.054369Z"
    },
    "name": [
        {
            "given": ["New"],
            "family": "Patient"
        }
    ],
    "birthDate": "2020-10-16"
}

And we update the last name to 'Provider'. The above version of the patient will be kept in _history, while in the 'Patient' store in the db, we will have the updated version:

{
    "resourceType": "Patient",
    "id": "fhirfli-294057507-6811107",
    "meta": {
        "versionId": "2",
        "lastUpdated": "2020-10-16T19:45:07.316698Z"
    },
    "name": [
        {
            "given": ["New"],
            "family": "Provider"
        }
    ],
    "birthDate": "2020-10-16"
}

This way we can keep track of all previous versions of all resources (which is obviously important in medicine).
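To make the flow concrete, here is a language-agnostic sketch in Python of the save-and-version behavior described above (an illustration only, not the package's actual implementation; the id format and store layout are assumptions):

import uuid
from datetime import datetime, timezone

stores = {}   # stand-in: one dict per resource type
history = []  # stand-in for the _history store

def save(resource):
    store = stores.setdefault(resource["resourceType"], {})
    old = store.get(resource.get("id"))
    now = datetime.now(timezone.utc).isoformat()
    if old is None:
        # no stored version: assign an id if needed, plus version metadata
        resource.setdefault("id", f"fhirfli-{uuid.uuid4().hex[:12]}")
        resource["meta"] = {"versionId": "1", "lastUpdated": now}
    else:
        history.append(old)  # copy the old version into _history
        resource["meta"] = {"versionId": str(int(old["meta"]["versionId"]) + 1),
                            "lastUpdated": now}
    store[resource["id"]] = resource
    return resource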

Most of the interactions (saving, deleting, etc.) work the way you'd expect. The only difference is search. Because Sembast is NoSQL, we can search on any of the fields in a resource. If, in our interface class, we have the following function:

  Future<Either<DbFailure, List<Resource>>> searchFunction(
      String resourceType, String searchString, String reference) async {
    List<Resource> resultList;
    try {
      resultList =
          await resourceDao.searchFor(resourceType, searchString, reference);
    } catch (error) {
      return left(DbFailure.unableToObtainList(error: error.toString()));
    }
    return right(resultList);
  }

You can search for all immunizations of a certain patient:

searchFunction(
        'Immunization', 'patient.reference', 'Patient/$patientId');

This function will search through all entries in the 'Immunization' store. It will look at all 'patient.reference' fields, and return any that match 'Patient/$patientId'.

The last thing I'll mention is that this is a password-protected db, using AES-256 encryption (although it can also use Salsa20). Any time you use the db, you have the option of using a password for encryption/decryption. Remember, if you set up the database using encryption, you will only be able to access it using that same password. When you're ready to change the password, you will need to call the update-password function. If we again assume we created a change-password method in our interface, it might look something like this:

class IFhirDb {
  IFhirDb();
  final ResourceDao resourceDao = ResourceDao();
  ...

  Future<Either<DbFailure, Unit>> updatePassword(
      String oldPassword, String newPassword) async {
    try {
      await resourceDao.updatePw(oldPassword, newPassword);
    } catch (error) {
      return left(DbFailure.unableToUpdatePassword(error: error.toString()));
    }
    return right(unit);
  }
}

You don't have to use a password, and in that case, it will save the db file as plain text. If you want to add a password later, it will encrypt it at that time.

General Store

After using this for a while in an app, I've realized that it needs to be able to store data apart from just FHIR resources, at least on occasion. For this, I've added a second class to all versions of the database called GeneralDao. It is similar to the ResourceDao, but with fewer options. So, in order to save something, it looks like this:

await GeneralDao().save('password', {'new':'map'});
await GeneralDao().save('password', {'new':'map'}, 'key');

The difference between these two options is that the first one will generate a key for the map being stored, while the second will store the map using the key provided. Both will return the key after successfully storing the map.

Other functions available include:

// deletes everything in the general store
await GeneralDao().deleteAllGeneral('password'); 

// delete specific entry
await GeneralDao().delete('password','key'); 

// returns map with that key
await GeneralDao().find('password', 'key'); 

FHIR® is a registered trademark of Health Level Seven International (HL7) and its use does not constitute an endorsement of products by HL7®

Use this package as a library

Depend on it

Run this command:

With Flutter:

 $ flutter pub add fhir_db

This will add a line like this to your package's pubspec.yaml (and run an implicit flutter pub get):

dependencies:
  fhir_db: ^0.4.3

Alternatively, your editor might support flutter pub get. Check the docs for your editor to learn more.

Import it

Now in your Dart code, you can use:

import 'package:fhir_db/dstu2.dart';
import 'package:fhir_db/dstu2/fhir_db.dart';
import 'package:fhir_db/dstu2/general_dao.dart';
import 'package:fhir_db/dstu2/resource_dao.dart';
import 'package:fhir_db/encrypt/aes.dart';
import 'package:fhir_db/encrypt/salsa.dart';
import 'package:fhir_db/r4.dart';
import 'package:fhir_db/r4/fhir_db.dart';
import 'package:fhir_db/r4/general_dao.dart';
import 'package:fhir_db/r4/resource_dao.dart';
import 'package:fhir_db/r5.dart';
import 'package:fhir_db/r5/fhir_db.dart';
import 'package:fhir_db/r5/general_dao.dart';
import 'package:fhir_db/r5/resource_dao.dart';
import 'package:fhir_db/stu3.dart';
import 'package:fhir_db/stu3/fhir_db.dart';
import 'package:fhir_db/stu3/general_dao.dart';
import 'package:fhir_db/stu3/resource_dao.dart'; 

example/lib/main.dart

import 'package:fhir/r4.dart';
import 'package:fhir_db/r4.dart';
import 'package:flutter/material.dart';
import 'package:test/test.dart';

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();

  final resourceDao = ResourceDao();

  // await resourceDao.updatePw('newPw', null);
  await resourceDao.deleteAllResources(null);

  group('Playing with passwords', () {
    test('Playing with Passwords', () async {
      final patient = Patient(id: Id('1'));

      final saved = await resourceDao.save(null, patient);

      await resourceDao.updatePw(null, 'newPw');
      final search1 = await resourceDao.find('newPw',
          resourceType: R4ResourceType.Patient, id: Id('1'));
      expect(saved, search1[0]);

      await resourceDao.updatePw('newPw', 'newerPw');
      final search2 = await resourceDao.find('newerPw',
          resourceType: R4ResourceType.Patient, id: Id('1'));
      expect(saved, search2[0]);

      await resourceDao.updatePw('newerPw', null);
      final search3 = await resourceDao.find(null,
          resourceType: R4ResourceType.Patient, id: Id('1'));
      expect(saved, search3[0]);

      await resourceDao.deleteAllResources(null);
    });
  });

  final id = Id('12345');
  group('Saving Things:', () {
    test('Save Patient', () async {
      final humanName = HumanName(family: 'Atreides', given: ['Duke']);
      final patient = Patient(id: id, name: [humanName]);
      final saved = await resourceDao.save(null, patient);

      expect(saved.id, id);

      expect((saved as Patient).name?[0], humanName);
    });

    test('Save Organization', () async {
      final organization = Organization(id: id, name: 'FhirFli');
      final saved = await resourceDao.save(null, organization);

      expect(saved.id, id);

      expect((saved as Organization).name, 'FhirFli');
    });

    test('Save Observation1', () async {
      final observation1 = Observation(
        id: Id('obs1'),
        code: CodeableConcept(text: 'Observation #1'),
        effectiveDateTime: FhirDateTime(DateTime(1981, 09, 18)),
      );
      final saved = await resourceDao.save(null, observation1);

      expect(saved.id, Id('obs1'));

      expect((saved as Observation).code.text, 'Observation #1');
    });

    test('Save Observation1 Again', () async {
      final observation1 = Observation(
          id: Id('obs1'),
          code: CodeableConcept(text: 'Observation #1 - Updated'));
      final saved = await resourceDao.save(null, observation1);

      expect(saved.id, Id('obs1'));

      expect((saved as Observation).code.text, 'Observation #1 - Updated');

      expect(saved.meta?.versionId, Id('2'));
    });

    test('Save Observation2', () async {
      final observation2 = Observation(
        id: Id('obs2'),
        code: CodeableConcept(text: 'Observation #2'),
        effectiveDateTime: FhirDateTime(DateTime(1981, 09, 18)),
      );
      final saved = await resourceDao.save(null, observation2);

      expect(saved.id, Id('obs2'));

      expect((saved as Observation).code.text, 'Observation #2');
    });

    test('Save Observation3', () async {
      final observation3 = Observation(
        id: Id('obs3'),
        code: CodeableConcept(text: 'Observation #3'),
        effectiveDateTime: FhirDateTime(DateTime(1981, 09, 18)),
      );
      final saved = await resourceDao.save(null, observation3);

      expect(saved.id, Id('obs3'));

      expect((saved as Observation).code.text, 'Observation #3');
    });
  });

  group('Finding Things:', () {
    test('Find 1st Patient', () async {
      final search = await resourceDao.find(null,
          resourceType: R4ResourceType.Patient, id: id);
      final humanName = HumanName(family: 'Atreides', given: ['Duke']);

      expect(search.length, 1);

      expect((search[0] as Patient).name?[0], humanName);
    });

    test('Find 3rd Observation', () async {
      final search = await resourceDao.find(null,
          resourceType: R4ResourceType.Observation, id: Id('obs3'));

      expect(search.length, 1);

      expect(search[0].id, Id('obs3'));

      expect((search[0] as Observation).code.text, 'Observation #3');
    });

    test('Find All Observations', () async {
      final search = await resourceDao.getResourceType(
        null,
        resourceTypes: [R4ResourceType.Observation],
      );

      expect(search.length, 3);

      final idList = [];
      for (final obs in search) {
        idList.add(obs.id.toString());
      }

      expect(idList.contains('obs1'), true);

      expect(idList.contains('obs2'), true);

      expect(idList.contains('obs3'), true);
    });

    test('Find All (non-historical) Resources', () async {
      final search = await resourceDao.getAll(null);

      expect(search.length, 5);
      final patList = search.toList();
      final orgList = search.toList();
      final obsList = search.toList();
      patList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Patient);
      orgList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Organization);
      obsList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Observation);

      expect(patList.length, 1);

      expect(orgList.length, 1);

      expect(obsList.length, 3);
    });
  });

  group('Deleting Things:', () {
    test('Delete 2nd Observation', () async {
      await resourceDao.delete(
          null, null, R4ResourceType.Observation, Id('obs2'), null, null);

      final search = await resourceDao.getResourceType(
        null,
        resourceTypes: [R4ResourceType.Observation],
      );

      expect(search.length, 2);

      final idList = [];
      for (final obs in search) {
        idList.add(obs.id.toString());
      }

      expect(idList.contains('obs1'), true);

      expect(idList.contains('obs2'), false);

      expect(idList.contains('obs3'), true);
    });

    test('Delete All Observations', () async {
      await resourceDao.deleteSingleType(null,
          resourceType: R4ResourceType.Observation);

      final search = await resourceDao.getAll(null);

      expect(search.length, 2);

      final patList = search.toList();
      final orgList = search.toList();
      patList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Patient);
      orgList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Organization);

      expect(patList.length, 1);

      expect(patList.length, 1);
    });

    test('Delete All Resources', () async {
      await resourceDao.deleteAllResources(null);

      final search = await resourceDao.getAll(null);

      expect(search.length, 0);
    });
  });

  group('Password - Saving Things:', () {
    test('Save Patient', () async {
      await resourceDao.updatePw(null, 'newPw');
      final humanName = HumanName(family: 'Atreides', given: ['Duke']);
      final patient = Patient(id: id, name: [humanName]);
      final saved = await resourceDao.save('newPw', patient);

      expect(saved.id, id);

      expect((saved as Patient).name?[0], humanName);
    });

    test('Save Organization', () async {
      final organization = Organization(id: id, name: 'FhirFli');
      final saved = await resourceDao.save('newPw', organization);

      expect(saved.id, id);

      expect((saved as Organization).name, 'FhirFli');
    });

    test('Save Observation1', () async {
      final observation1 = Observation(
        id: Id('obs1'),
        code: CodeableConcept(text: 'Observation #1'),
        effectiveDateTime: FhirDateTime(DateTime(1981, 09, 18)),
      );
      final saved = await resourceDao.save('newPw', observation1);

      expect(saved.id, Id('obs1'));

      expect((saved as Observation).code.text, 'Observation #1');
    });

    test('Save Observation1 Again', () async {
      final observation1 = Observation(
          id: Id('obs1'),
          code: CodeableConcept(text: 'Observation #1 - Updated'));
      final saved = await resourceDao.save('newPw', observation1);

      expect(saved.id, Id('obs1'));

      expect((saved as Observation).code.text, 'Observation #1 - Updated');

      expect(saved.meta?.versionId, Id('2'));
    });

    test('Save Observation2', () async {
      final observation2 = Observation(
        id: Id('obs2'),
        code: CodeableConcept(text: 'Observation #2'),
        effectiveDateTime: FhirDateTime(DateTime(1981, 09, 18)),
      );
      final saved = await resourceDao.save('newPw', observation2);

      expect(saved.id, Id('obs2'));

      expect((saved as Observation).code.text, 'Observation #2');
    });

    test('Save Observation3', () async {
      final observation3 = Observation(
        id: Id('obs3'),
        code: CodeableConcept(text: 'Observation #3'),
        effectiveDateTime: FhirDateTime(DateTime(1981, 09, 18)),
      );
      final saved = await resourceDao.save('newPw', observation3);

      expect(saved.id, Id('obs3'));

      expect((saved as Observation).code.text, 'Observation #3');
    });
  });

  group('Password - Finding Things:', () {
    test('Find 1st Patient', () async {
      final search = await resourceDao.find('newPw',
          resourceType: R4ResourceType.Patient, id: id);
      final humanName = HumanName(family: 'Atreides', given: ['Duke']);

      expect(search.length, 1);

      expect((search[0] as Patient).name?[0], humanName);
    });

    test('Find 3rd Observation', () async {
      final search = await resourceDao.find('newPw',
          resourceType: R4ResourceType.Observation, id: Id('obs3'));

      expect(search.length, 1);

      expect(search[0].id, Id('obs3'));

      expect((search[0] as Observation).code.text, 'Observation #3');
    });

    test('Find All Observations', () async {
      final search = await resourceDao.getResourceType(
        'newPw',
        resourceTypes: [R4ResourceType.Observation],
      );

      expect(search.length, 3);

      final idList = [];
      for (final obs in search) {
        idList.add(obs.id.toString());
      }

      expect(idList.contains('obs1'), true);

      expect(idList.contains('obs2'), true);

      expect(idList.contains('obs3'), true);
    });

    test('Find All (non-historical) Resources', () async {
      final search = await resourceDao.getAll('newPw');

      expect(search.length, 5);
      final patList = search.toList();
      final orgList = search.toList();
      final obsList = search.toList();
      patList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Patient);
      orgList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Organization);
      obsList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Observation);

      expect(patList.length, 1);

      expect(orgList.length, 1);

      expect(obsList.length, 3);
    });
  });

  group('Password - Deleting Things:', () {
    test('Delete 2nd Observation', () async {
      await resourceDao.delete(
          'newPw', null, R4ResourceType.Observation, Id('obs2'), null, null);

      final search = await resourceDao.getResourceType(
        'newPw',
        resourceTypes: [R4ResourceType.Observation],
      );

      expect(search.length, 2);

      final idList = [];
      for (final obs in search) {
        idList.add(obs.id.toString());
      }

      expect(idList.contains('obs1'), true);

      expect(idList.contains('obs2'), false);

      expect(idList.contains('obs3'), true);
    });

    test('Delete All Observations', () async {
      await resourceDao.deleteSingleType('newPw',
          resourceType: R4ResourceType.Observation);

      final search = await resourceDao.getAll('newPw');

      expect(search.length, 2);

      final patList = search.toList();
      final orgList = search.toList();
      patList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Patient);
      orgList.retainWhere(
          (resource) => resource.resourceType == R4ResourceType.Organization);

      expect(patList.length, 1);

      expect(patList.length, 1);
    });

    test('Delete All Resources', () async {
      await resourceDao.deleteAllResources('newPw');

      final search = await resourceDao.getAll('newPw');

      expect(search.length, 0);

      await resourceDao.updatePw('newPw', null);
    });
  });
} 

Download Details:

Author: MayJuun

Source Code: https://github.com/MayJuun/fhir/tree/main/fhir_db



OpenAI’s Not So Open GPT-3 Can Impact Its Efficacy

With OpenAI’s recent breakthrough, the pre-trained language model GPT-3, the company has revolutionized the idea of machines writing code like humans — a step towards artificial general intelligence. Not only is it being used for writing code, but also for writing blogs and creating stories, websites, and apps. In fact, in recent news, a college student created an entirely fake blog using GPT-3, which created a massive buzz in the ML community and has also been trending on Hacker News.


Top Free Resources To Learn GPT-3 - Analytics India Magazine

With OpenAI releasing its avant-garde pre-trained language model, GPT-3 has suddenly become an obsession for the machine learning community: it can not only generate code but also human-like stories. Along with its wide range of utilities, it has also surprised developers and programmers with its generalized intelligence, which is considerably more advanced than previous pre-trained language models.

Previously, NLP systems struggled to learn from a few examples; with GPT-3, however, language models can improve significantly, even reaching competitiveness with prior state-of-the-art fine-tuning approaches. That being said, to use GPT-3 with its 175 billion trainable parameters, developers and programmers must understand what’s going on under the hood of the neural-network-powered language model.

Not only can it be new and complex to understand for first-timers, but its sheer size can also be overwhelming. With innovation in AI models happening at a rapid rate, understanding GPT-3 can open up various applications in no time. Here are a few free resources that can help developers understand GPT-3 effectively and get hands-on with this pioneering ML model.


**Also Read:** GPT-3 Has Weaknesses And Makes Silly Mistakes: Sam Altman, OpenAI

OpenAI GPT-3: Beginners Tutorial

From: Bhavesh Bhatt

**About:** OpenAI GPT-3 by Bhavesh Bhatt is a beginner-level tutorial presented in a video on his YouTube channel. Bhavesh Bhatt is a data scientist based out of Mumbai, India, who has been working as a Google Developer Expert in Machine Learning. Bhavesh was also awarded the prestigious 40 Under 40 Data Scientist award by Analytics India Magazine in January 2020. This tutorial not only makes it easy for users to understand the concept of GPT-3 but also builds a demo that converts standard English text into SQL-like queries using GPT-3. Bhavesh has also created a YouTube video showcasing how one can generate Python Pandas code and Matplotlib visualizations using GPT-3.




Developers Only Need to Master One Button Component

From then on, developers only need to master one Button component, and that's enough.

It supports corners, borders, icons, special effects, loading mode, and a high-quality Neumorphism style.

Author: Newton (coorchice.cb@alibaba-inc.com)

✨ Features

Rich corner effects

Exquisite border decoration

Gradient effects

Flexible icon support

Convenient loading mode

Cool interactive special effects

Shadows with a greater sense of space

High-quality Neumorphism style

🛠 Guide

⚙️ Parameters

🔩 Basic parameters

| Param | Type | Necessary | Default | Desc |
| --- | --- | --- | --- | --- |
| onPressed | VoidCallback | true | null | Click callback. If null, FButton will enter an unavailable state |
| onPressedDown | VoidCallback | false | null | Callback when pressed |
| onPressedUp | VoidCallback | false | null | Callback when lifted |
| onPressedCancel | VoidCallback | false | null | Callback when the press is cancelled |
| height | double | false | null | Height |
| width | double | false | null | Width |
| style | TextStyle | false | null | Text style |
| disableStyle | TextStyle | false | null | Text style when unavailable |
| alignment | Alignment | false | null | Alignment |
| text | String | false | null | Button text |
| color | Color | false | null | Button color |
| disabledColor | Color | false | null | Color when FButton is unavailable |
| padding | EdgeInsetsGeometry | false | null | FButton internal spacing |
| corner | FCorner | false | null | Configure the corners of the widget |
| cornerStyle | FCornerStyle | false | FCornerStyle.round | Corner style: round = rounded corners, bevel = beveled |
| strokeColor | Color | false | Colors.black | Border color |
| strokeWidth | double | false | 0 | Border width. The border appears when strokeWidth > 0 |
| gradient | Gradient | false | null | Gradient colors. Overrides color |
| activeMaskColor | Color | false | Colors.transparent | The color of the mask when pressed |
| surfaceStyle | FSurface | false | FSurface.Flat | Surface style. See [FSurface] for details |

💫 Effect parameters

| Param | Type | Necessary | Default | Desc |
| --- | --- | --- | --- | --- |
| clickEffect | bool | false | false | Whether to enable click effects |
| hoverColor | Color | false | null | FButton color when hovering |
| onHover | ValueChanged | false | null | Callback when the mouse enters/exits the component |
| highlightColor | Color | false | null | The color of the FButton when touched (requires click effects to be enabled) |

🔳 Shadow parameters

| Param | Type | Necessary | Default | Desc |
| --- | --- | --- | --- | --- |
| shadowColor | Color | false | Colors.grey | Shadow color |
| shadowOffset | Offset | false | Offset.zero | Shadow offset |
| shadowBlur | double | false | 1.0 | Shadow blur degree; the larger the value, the larger the shadow range |

🖼 Icon & Loading parameters

| Param | Type | Necessary | Default | Desc |
| --- | --- | --- | --- | --- |
| image | Widget | false | null | An icon can be configured for FButton |
| imageMargin | double | false | 6.0 | Spacing between icon and text |
| imageAlignment | ImageAlignment | false | ImageAlignment.left | Relative position of icon and text |
| loading | bool | false | false | Whether to enter the loading state |
| loadingWidget | Widget | false | null | Loading widget in the loading state. Overrides the default loading effect |
| clickLoading | bool | false | false | Whether to enter the loading state after clicking FButton |
| loadingColor | Color | false | null | Loading color |
| loadingStrokeWidth | double | false | 4.0 | Loading stroke width |
| hideTextOnLoading | bool | false | false | Whether to hide text in the loading state |
| loadingText | String | false | null | Loading text |
| loadingSize | double | false | 12 | Loading size |

🍭 Neumorphism Style

| Param | Type | Necessary | Default | Desc |
| --- | --- | --- | --- | --- |
| isSupportNeumorphism | bool | false | false | Whether to support the Neumorphism style. When enabled, [highlightColor] is invalid |
| lightOrientation | FLightOrientation | false | FLightOrientation.LeftTop | Valid when [isSupportNeumorphism] is true. The light source direction (upper left, lower left, upper right, or lower right), which affects the highlight and shadow directions |
| highlightShadowColor | Color | false | null | The bright shadow color when the Neumorphism style is on |

📺 Demo

🔩 Basic Demo

// FButton #1
FButton(
  height: 40,
  alignment: Alignment.center,
  text: "FButton #1",
  style: TextStyle(color: Colors.white),
  color: Color(0xffffab91),
  onPressed: () {},
)

// FButton #2
FButton(
  padding: const EdgeInsets.fromLTRB(12, 8, 12, 8),
  text: "FButton #2",
  style: TextStyle(color: Colors.white),
  color: Color(0xffffab91),
  corner: FCorner.all(6.0),
)

// FButton #3
FButton(
  padding: const EdgeInsets.fromLTRB(12, 8, 12, 8),
  text: "FButton #3",
  style: TextStyle(color: Colors.white),
  disableStyle: TextStyle(color: Colors.black38),
  color: Color(0xffF8AD36),

  /// set disable Color
  disabledColor: Colors.grey[300],
  corner: FCorner.all(6.0),
)

By simply configuring text and onPressed, you can construct an available FButton.

If onPressed is not set, FButton will automatically be recognized as unavailable. In this case, FButton will have a default unavailable-status style.

You can also freely configure the style of FButton when it is not available via the disabledXXX attribute.

🎈 Corner & Stroke

// #1
FButton(
  width: 130,
  text: "FButton #1",
  style: TextStyle(color: Colors.white),
  color: Color(0xffFF7043),
  onPressed: () {},
  clickEffect: true,
  
  /// set corner size
  corner: FCorner.all(25),
),

// #2
FButton(
  width: 130,
  text: "FButton #2",
  style: TextStyle(color: Colors.white),
  color: Color(0xffFFA726),
  onPressed: () {},
  clickEffect: true,
  corner: FCorner(
    leftBottomCorner: 40,
    leftTopCorner: 6,
    rightTopCorner: 40,
    rightBottomCorner: 6,
  ),
),

// #3
FButton(
  width: 130,
  text: "FButton #3",
  style: TextStyle(color: Colors.white),
  color: Color(0xffFFc900),
  onPressed: () {},
  clickEffect: true,
  corner: FCorner(leftTopCorner: 10),
  
  /// set corner style
  cornerStyle: FCornerStyle.bevel,
  strokeWidth: 0.5,
  strokeColor: Color(0xffF9A825),
),

// #4
FButton(
  width: 130,
  padding: EdgeInsets.fromLTRB(6, 16, 30, 16),
  text: "FButton #4",
  style: TextStyle(color: Colors.white),
  color: Color(0xff00B0FF),
  onPressed: () {},
  clickEffect: true,
  corner: FCorner(
      rightTopCorner: 25,
      rightBottomCorner: 25),
  cornerStyle: FCornerStyle.bevel,
  strokeWidth: 0.5,
  strokeColor: Color(0xff000000),
),

You can add rounded corners to FButton via the corner property. You can even control each corner individually.

By default, the corners of FButton are rounded. By setting cornerStyle: FCornerStyle.bevel, you can get a bevel effect.

FButton also supports borders: set strokeWidth > 0 to get the effect 🥳.

🌈 Gradient


FButton(
  width: 100,
  height: 60,
  text: "#1",
  style: TextStyle(color: Colors.white),
  color: Color(0xffFFc900),
  
  /// set gradient
  gradient: LinearGradient(colors: [
    Color(0xff00B0FF),
    Color(0xffFFc900),
  ]),
  onPressed: () {},
  clickEffect: true,
  corner: FCorner.all(8),
)

Through the gradient attribute, you can build an FButton with gradient colors. You can freely build many kinds of gradients.

🍭 Icon

FButton(
  width: 88,
  height: 38,
  padding: EdgeInsets.all(0),
  text: "Back",
  style: TextStyle(color: Colors.white),
  color: Color(0xffffc900),
  onPressed: () {
    toast(context, "Back!");
  },
  clickEffect: true,
  corner: FCorner(
    leftTopCorner: 25,
    leftBottomCorner: 25,),
  
  /// set icon
  image: Icon(
    Icons.arrow_back_ios,
    color: Colors.white,
    size: 12,
  ),

  /// Configure the spacing between icon and text
  imageMargin: 8,
),

FButton(
  onPressed: () {},
  image: Icon(
    Icons.print,
    color: Colors.grey,
  ),
  imageMargin: 8,

  /// Configure the relative position of icons and text
  imageAlignment: ImageAlignment.top,
  text: "Print",
  style: TextStyle(color: textColor),
  color: Colors.transparent,
),

The image property can set an image for FButton, and you can adjust the position of the image relative to the text through imageAlignment.

If the button does not need a background, just set color: Colors.transparent.

🔥 Effect


FButton(
  width: 200,
  text: "Try Me!",
  style: TextStyle(color: textColor),
  color: Color(0xffffc900),
  onPressed: () {},
  clickEffect: true,
  corner: FCorner.all(9),
  
  /// set pressed color
  highlightColor: Color(0xffE65100).withOpacity(0.20),
  
  /// set hover color
  hoverColor: Colors.redAccent.withOpacity(0.16),
),

The highlight color of FButton can be configured through the highlightColor property.

hoverColor configures the color shown when the mouse moves over FButton, which is useful for Web development.

🔆 Loading

FButton(
  text: "Click top loading",
  style: TextStyle(color: textColor),
  color: Color(0xffffc900),
  ...

  /// set loading size
  loadingSize: 15,

  /// Configure the spacing between loading and text
  imageMargin: 6,
  
  /// set loading width
  loadingStrokeWidth: 2,

  /// Whether to start loading automatically on click
  clickLoading: true,

  /// set loading color
  loadingColor: Colors.white,

  /// set loading text
  loadingText: "Loading...",

  /// Configure the relative position of loading and text
  imageAlignment: ImageAlignment.top,
),

// #2
FButton(
  width: 170,
  height: 70,
  text: "Click to loading",
  style: TextStyle(color: textColor),
  color: Color(0xffffc900),
  onPressed: () { },
  ...
  imageMargin: 8,
  loadingSize: 15,
  loadingStrokeWidth: 2,
  clickLoading: true,
  loadingColor: Colors.white,
  loadingText: "Loading...",

  /// Hide text when loading
  hideTextOnLoading: true,
)


FButton(
  width: 170,
  height: 70,
  alignment: Alignment.center,
  text: "Click to loading",
  style: TextStyle(color: Colors.white),
  color: Color(0xff90caf9),
  ...
  imageMargin: 8,
  clickLoading: true,
  hideTextOnLoading: true,

  /// Configure a custom loading style
  loadingWidget: CupertinoActivityIndicator(),
),

Through the loading attribute, you can configure loading effects for FButton.

When FButton is in the loading state, it enters an unavailable state: onPressed will no longer be triggered, and the unavailable styles are applied.

At the same time, loadingText will overwrite text if it is not null.

The click-to-start-loading effect can be achieved through the clickLoading attribute.

The position of the loading widget is affected by the imageAlignment attribute.

When hideTextOnLoading: true, if FButton is in the loading state, its text will be hidden.

Through loadingWidget, developers can set completely customized loading styles.

Shadow


FButton(
  width: 200,
  text: "Shadow",
  textColor: Colors.white,
  color: Color(0xffffc900),
  onPressed: () {},
  clickEffect: true,
  corner: FCorner.all(28),
  
  /// set shadow color
  shadowColor: Colors.black87,

  /// Sets the standard deviation of the Gaussian blur applied to the shadow shape.
  shadowBlur: _shadowBlur,
),

FButton allows you to configure the color, size, and position of the shadow.

🍭 Neumorphism Style

FButton(

  /// Turn on Neumorphism support
  isSupportNeumorphism: true,

  /// Configure the light source direction
  lightOrientation: lightOrientation,

  /// Configure the highlight shadow
  highlightShadowColor: Colors.white,

  /// Configure the dark shadow
  shadowColor: mainShadowColor,
  strokeColor: mainBackgroundColor,
  strokeWidth: 3.0,
  width: 190,
  height: 60,
  text: "FWidget",
  style: TextStyle(
      color: mainTextTitleColor, fontSize: neumorphismSize_2_2),
  alignment: Alignment.center,
  color: mainBackgroundColor,
  ...
)

FButton brings an incredible, ultra-high texture Neumorphism style to developers.

Developers only need to configure the isSupportNeumorphism parameter to enable and disable the Neumorphism style.

If you want to adjust the Neumorphism style, you can make subtle adjustments through the several shadow-related attributes, among which:

shadowColor: configures the dark shadow

highlightShadowColor: configures the highlight shadow

FButton also provides the lightOrientation parameter, which even lets developers adjust the angle of the light to obtain different Neumorphism effects.

😃 How to use?

Add dependencies in the project pubspec.yaml file:

🌐 pub dependency

dependencies:
  fbutton: ^<version number>

⚠️ Attention: please go to [pub](https://pub.dev/packages/fbutton) to get the latest version number of FButton

🖥 git dependencies

dependencies:
  fbutton:
    git:
      url: 'git@github.com:Fliggy-Mobile/fbutton.git'
      ref: '<Branch number or tag number>'

Use this package as a library

Depend on it

Run this command:

With Flutter:

 $ flutter pub add fbutton_nullsafety

This will add a line like this to your package's pubspec.yaml (and run an implicit flutter pub get):

dependencies:
  fbutton_nullsafety: ^5.0.0

Alternatively, your editor might support flutter pub get. Check the docs for your editor to learn more.

Import it

Now in your Dart code, you can use:

import 'package:fbutton_nullsafety/fbutton_nullsafety.dart';

Download Details:

Author: Fliggy-Mobile

Source Code: https://github.com/Fliggy-Mobile/fbutton
