Zero-Shot Text-Driven Generation and Animation of 3D Avatars in Python

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

Fangzhou Hong1*  Mingyuan Zhang1*  Liang Pan1  Zhongang Cai1,2,3  Lei Yang2  Ziwei Liu1+

1S-Lab, Nanyang Technological University  2SenseTime Research  3Shanghai AI Laboratory

*equal contribution  +corresponding author

Accepted to SIGGRAPH 2022 (Journal Track)


AvatarCLIP generates and animates avatars given descriptions of body shapes, appearances, and motions.

A tall and skinny female soldier that is arguing. A skinny ninja that is raising both arms. An overweight sumo wrestler that is sitting. A tall and fat Iron Man that is running.

This repository contains the official implementation of AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars.

[Project Page][arXiv][High-Res PDF (166M)][Supplementary Video][Colab Demo]


[09/2022] If you are looking for a higher-quality text2motion method, check out our new work MotionDiffuse!

[07/2022] Code release for the motion generation part!

[05/2022] Paper uploaded to arXiv.

[05/2022] Added a Colab demo for avatar generation!

[05/2022] Added support for converting the generated avatar to the animatable FBX format! Check out how to use the FBX models, or see the instructions for the conversion code.

[05/2022] Code release for the avatar generation part!

[04/2022] AvatarCLIP is accepted to SIGGRAPH 2022 (Journal Track)!


If you find our work useful for your research, please consider citing the paper:

    title={AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars},
    author={Hong, Fangzhou and Zhang, Mingyuan and Pan, Liang and Cai, Zhongang and Yang, Lei and Liu, Ziwei},
    journal={ACM Transactions on Graphics (TOG)},
    publisher={ACM New York, NY, USA},

Use Generated FBX Models


Visit our project page and go to the section 'Avatar Gallery'. Pick a model you like and click 'Load Model' below it. Then click the 'Download FBX' link at the bottom of the pop-up viewer.

Import to Your Favourite 3D Software (e.g. Blender, Unity3D)

The FBX models are already rigged. Use your motion library to animate them!


Upload to Mixamo

To make use of the rich motion library provided by Mixamo, you can also upload the FBX model to Mixamo. The rigging process is completely automatic!



We recommend using Anaconda to manage the Python environment. The setup commands below are provided for your reference.

git clone
cd AvatarCLIP
conda create -n AvatarCLIP python=3.7
conda activate AvatarCLIP
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.1 -c pytorch
pip install -r requirements.txt

In addition to the steps above, install neural_renderer following its instructions. Before compiling neural_renderer (doing it after compiling also works), add the following three lines to neural_renderer/ after line 19.

x[z<=0] = 0
y[z<=0] = 0
z[z<=0] = 0

This quick fix addresses a rendering issue where objects behind the camera are also rendered. Be careful when using this patched version of neural_renderer in your other projects, because the fix makes the rendering process non-differentiable.
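The effect of the three added lines can be illustrated outside the renderer. The sketch below uses NumPy arrays as a stand-in for the renderer's tensors (an assumption; the names x, y, z mirror the snippet above, and the sample values are made up). Every vertex with z <= 0, i.e. at or behind the camera plane, is collapsed to the origin.

```python
import numpy as np

# Hypothetical camera-space coordinates for four vertices (made-up values).
x = np.array([0.5, -1.0, 2.0, 0.3])
y = np.array([1.0, 0.2, -0.5, 0.7])
z = np.array([2.0, -0.1, 0.0, 1.5])  # the second and third vertices are not in front of the camera

# Same three assignments as in the fix; each line rebuilds the boolean mask from z.
x[z <= 0] = 0
y[z <= 0] = 0
z[z <= 0] = 0

print(x, y, z)
```

Vertices in front of the camera keep their coordinates; only the masked entries are overwritten in place.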

To support offscreen rendering for motion visualization, install the osmesa library.

conda install -c menpo osmesa

Data Preparation

Download SMPL Models

Register and download SMPL models here. Put the downloaded models in the folder smpl_models. The folder structure should look like

├── ...
└── smpl_models/
    ├── smpl/
        ├── SMPL_FEMALE.pkl
        ├── SMPL_MALE.pkl
        └── SMPL_NEUTRAL.pkl

Download Pretrained Models & Other Data

This download is only needed for coarse shape generation and motion generation. You can skip it if you only want to use the other parts. Download the pretrained weights and other required data here. Put them in the folder AvatarGen so that the folder structure looks like

├── ...
└── AvatarGen/
    └── ShapeGen/
        └── data/
            ├── codebook.pth
            ├── model_VAE_16.pth
            ├── nongrey_male_0110.jpg
            ├── smpl_uv.mtl
            └── smpl_uv.obj

Pretrained weights and the human texture for motion generation can be downloaded here. Note that the human texture we use to render poses is from the SURREAL dataset. You also need to download the pretrained weights of VPoser v2.0. Put them in the folder AvatarAnimate so that the folder structure looks like

├── ...
└── AvatarAnimate/
    └── data/
        ├── codebook.pth
        ├── motion_vae.pth
        ├── pose_realnvp.pth
        ├── nongrey_male_0110.jpg
        ├── smpl_uv.mtl
        ├── smpl_uv.obj
        └── vposer
            ├── V02_05.log
            ├── V02_05.yaml
            └── snapshots
                ├── V02_05_epoch=08_val_loss=0.03.ckpt
                └── V02_05_epoch=13_val_loss=0.03.ckpt
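Before running anything, it can help to confirm that the downloads landed where the listings above expect them. The following is a minimal sketch, not part of the repository: the helper name find_missing is hypothetical, and it assumes that smpl_models, AvatarGen, and AvatarAnimate all sit directly under the repository root.

```python
from pathlib import Path

# Relative paths copied from the three folder listings above.
EXPECTED_FILES = [
    "smpl_models/smpl/SMPL_FEMALE.pkl",
    "smpl_models/smpl/SMPL_MALE.pkl",
    "smpl_models/smpl/SMPL_NEUTRAL.pkl",
    "AvatarGen/ShapeGen/data/codebook.pth",
    "AvatarGen/ShapeGen/data/model_VAE_16.pth",
    "AvatarGen/ShapeGen/data/nongrey_male_0110.jpg",
    "AvatarGen/ShapeGen/data/smpl_uv.mtl",
    "AvatarGen/ShapeGen/data/smpl_uv.obj",
    "AvatarAnimate/data/codebook.pth",
    "AvatarAnimate/data/motion_vae.pth",
    "AvatarAnimate/data/pose_realnvp.pth",
    "AvatarAnimate/data/nongrey_male_0110.jpg",
    "AvatarAnimate/data/smpl_uv.mtl",
    "AvatarAnimate/data/smpl_uv.obj",
    "AvatarAnimate/data/vposer/V02_05.log",
    "AvatarAnimate/data/vposer/V02_05.yaml",
    "AvatarAnimate/data/vposer/snapshots/V02_05_epoch=08_val_loss=0.03.ckpt",
    "AvatarAnimate/data/vposer/snapshots/V02_05_epoch=13_val_loss=0.03.ckpt",
]


def find_missing(repo_root, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not files under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Run from the repository root (assumed layout).
    for rel in find_missing("."):
        print("missing:", rel)
```

If you only use part of the pipeline, pass a shorter list as the expected argument.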

Avatar Generation

Coarse Shape Generation

The folder AvatarGen/ShapeGen contains the code for this part. Run the following command to generate the coarse shape corresponding to the shape description 'a strong man'. We recommend using the prompt augmentation 'a 3d rendering of xxx in unreal engine' for better results. The generated coarse body mesh will be stored under AvatarGen/ShapeGen/output/coarse_shape.

python --target_txt 'a 3d rendering of a strong man in unreal engine'

Then we need to render the mesh for initialization of the implicit avatar representation. Use the following command for rendering.

python --coarse_shape_obj output/coarse_shape/a_3d_rendering_of_a_strong_man_in_unreal_engine.obj --output_folder ${RENDER_FOLDER}
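In the example above, the mesh file name matches the prompt with spaces replaced by underscores. The helper below reproduces that mapping so the rendering step can be scripted; it is a guess based on this single example, and prompt_to_obj_name is a hypothetical name. The actual script may treat punctuation or other characters differently.

```python
def prompt_to_obj_name(prompt: str) -> str:
    """Guess the .obj file name for a prompt: spaces become underscores (inferred rule)."""
    return prompt.strip().replace(" ", "_") + ".obj"


print(prompt_to_obj_name("a 3d rendering of a strong man in unreal engine"))
# a_3d_rendering_of_a_strong_man_in_unreal_engine.obj
```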

Shape Sculpting and Texture Generation

Note that all the code is tested on an NVIDIA V100 (32GB memory). To run on GPUs with less memory, try scaling down the network or reducing max_ray_num in the config files. You can refer to confs/examples_small/example.conf or our Colab demo for a scaled-down version of AvatarCLIP.

The folder AvatarGen/AppearanceGen contains the code for this part. We provide data, a pretrained model, and scripts to perform shape sculpting and texture generation on a zero-beta body (the mean shape defined by SMPL). We provide many example configs under AvatarGen/AppearanceGen/confs/examples. For example, to generate 'Abraham Lincoln', which is defined in the config file confs/examples/abrahamlincoln.conf, use the following command.

python --mode train_clip --conf confs/examples/abrahamlincoln.conf

Results will be stored in AvatarCLIP/AvatarGen/AppearanceGen/exp/smpl/examples/abrahamlincoln.

To perform shape sculpting and texture generation on the previously generated coarse shape, we also provide example config files: confs/base_models/astrongman.conf and confs/astrongman/*.conf. Two optimization steps are required, as follows.

# Initialization of the implicit avatar
python --mode train --conf confs/base_models/astrongman.conf
# Shape sculpting and texture generation on the initialized implicit avatar
python --mode train_clip --conf confs/astrongman/hulk.conf

Marching Cubes

To extract meshes from the generated implicit avatar, one may use the following command.

python --mode validate_mesh --conf confs/examples/abrahamlincoln.conf

The final high-resolution mesh will be stored as AvatarCLIP/AvatarGen/AppearanceGen/exp/smpl/examples/abrahamlincoln/meshes/00030000.ply.
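To sanity-check an extracted mesh without opening a 3D viewer, you can read the element counts from the PLY header, which is plain text even when the body is binary. The following is a small standard-library sketch; read_ply_counts is a hypothetical helper, not part of the repository.

```python
def read_ply_counts(path):
    """Return {element_name: count} parsed from the header of a PLY file."""
    counts = {}
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            line = raw.strip()
            if line == b"end_header":
                break  # stop before the (possibly binary) body
            parts = line.split()
            # Header lines of interest look like: element vertex 12345
            if len(parts) == 3 and parts[0] == b"element":
                counts[parts[1].decode("ascii")] = int(parts[2])
    return counts


# Example call, using the output path from above:
# read_ply_counts("exp/smpl/examples/abrahamlincoln/meshes/00030000.ply")
```

A mesh with zero vertices or faces usually means the extraction step did not find a surface.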

Convert Avatar to FBX Format

To make it convenient to use the generated avatar in modern graphics pipelines, we also provide scripts to rig the avatar and convert it to the FBX format. See the instructions here.

Motion Generation

Candidate Poses Generation

Here we provide four different methods for pose generation.

PoseOptimizer: directly optimize on SMPL theta

VPoserOptimizer: optimize the latent space of VPoser

VPoserRealNVP: get latent codes of VPoser from pretrained conditional RealNVP

VPoserCodebook: select the most similar poses to the given text feature
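The selection idea behind VPoserCodebook can be sketched as a top-k nearest-neighbour lookup. The example below assumes cosine similarity and uses toy 3-D vectors in place of real features (both are assumptions; the source only says "most similar"), and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm


def top_k_similar(text_feature, pose_features, k=5):
    """Return the indices of the k pose features most similar to text_feature."""
    scores = [cosine(text_feature, feat) for feat in pose_features]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy vectors standing in for text and pose features (made-up values).
text_feature = [1.0, 0.0, 0.0]
pose_features = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.1], [-1.0, 0.0, 0.0]]
print(top_k_similar(text_feature, pose_features, k=2))  # [2, 0]
```

The returned indices would then be used to look up the corresponding poses in the codebook.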

We provide configurations to compare these methods. Here are some examples:

# Suppose your current location is `AvatarCLIP/AvatarAnimate`

# Use PoseOptimizer method to generate poses for "arguing"
python --conf confs/pose_ablation/pose_optimizer/argue.conf
# Results are stored in `AvatarCLIP/AvatarAnimate/exp/pose_ablation/pose_optimizer/argue` directory
# candidate_0.jpg, candidate_1.jpg, ..., candidate_4.jpg are the top-5 poses
# candidate_0.npy, candidate_1.npy, ..., candidate_4.npy are corresponding parameters

# Use VPoserOptimizer method to generate poses for "praying"
python --conf confs/pose_ablation/vposer_optimizer/pray.conf
# Results are stored in `AvatarCLIP/AvatarAnimate/exp/pose_ablation/vposer_optimizer/pray` directory

# Use VPoserRealNVP method to generate poses for "shooting a basketball"
python --conf confs/pose_ablation/vposer_realnvp/shoot_basketball.conf
# Results are stored in `AvatarCLIP/AvatarAnimate/exp/pose_ablation/vposer_realnvp/shoot_basketball` directory

# Use VPoserCodebook method to generate poses for "running"
python --conf confs/pose_ablation/vposer_codebook/run.conf
# Results are stored in `AvatarCLIP/AvatarAnimate/exp/pose_ablation/vposer_codebook/run` directory
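To post-process the results, it is handy to pair each candidate image with its parameter file. The sketch below relies only on the candidate_<i>.jpg / candidate_<i>.npy naming described in the comments above; list_candidates is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def list_candidates(result_dir):
    """Return (image_path, params_path) pairs for candidate_<i> files, sorted by i."""
    root = Path(result_dir)
    pairs = []
    for img in root.glob("candidate_*.jpg"):
        params = img.with_suffix(".npy")
        if params.is_file():  # keep only candidates that have both files
            pairs.append((img, params))
    # Sort numerically so that candidate_10 comes after candidate_9.
    return sorted(pairs, key=lambda pair: int(pair[0].stem.split("_")[1]))


# Example call, using a results directory from above:
# list_candidates("exp/pose_ablation/pose_optimizer/argue")
```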

Motion Generation

Here we provide three different methods for motion generation.

MotionInterpolation: directly interpolate between given poses

MotionOptimizer (baseline): optimize latent code of a pretrained VAE with a simple reconstruction loss

MotionOptimizer (ours): optimize latent code of a pretrained VAE with weighted reconstruction loss, delta loss, and clip loss
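As a rough illustration of the MotionInterpolation idea, the sketch below blends linearly between consecutive key poses, each given as a flat list of parameters. Linear blending of raw pose parameters is an assumption made for simplicity (the source only says poses are interpolated directly), and both function names are hypothetical.

```python
def interpolate_poses(pose_a, pose_b, num_frames):
    """Linearly blend from pose_a to pose_b; returns num_frames poses, endpoints included."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    frames = []
    for i in range(num_frames):
        t = i / (num_frames - 1)  # goes from 0.0 to 1.0
        frames.append([(1 - t) * a + t * b for a, b in zip(pose_a, pose_b)])
    return frames


def interpolate_sequence(key_poses, frames_per_segment):
    """Chain interpolate_poses over consecutive key poses without repeating shared endpoints."""
    motion = [list(key_poses[0])]
    for a, b in zip(key_poses, key_poses[1:]):
        motion.extend(interpolate_poses(a, b, frames_per_segment)[1:])
    return motion


print(interpolate_sequence([[0.0], [1.0], [0.0]], frames_per_segment=3))
# [[0.0], [0.5], [1.0], [0.5], [0.0]]
```

Because this blend has no learned motion prior, it makes a useful point of comparison for the two optimizer-based methods.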

We provide configurations to compare these methods. Here are some examples:

# Suppose your current location is `AvatarCLIP/AvatarAnimate`

# Use MotionInterpolation method to generate motion for "arguing"
python --conf confs/motion_ablation/interpolation/argue.conf
# Results are stored in `AvatarCLIP/AvatarAnimate/exp/motion_ablation/interpolation/argue` directory
# candidate_0.jpg, candidate_1.jpg, ..., candidate_4.jpg are the top-5 poses
# candidate_0.npy, candidate_1.npy, ..., candidate_4.npy are corresponding parameters
# motion.mp4 is the generated motion
# motion.npy is corresponding parameters

# Use MotionOptimizer (baseline) method to generate motion for "praying"
python --conf confs/motion_ablation/baseline/pray.conf
# Results are stored in `AvatarCLIP/AvatarAnimate/exp/motion_ablation/baseline/pray` directory

# Use MotionOptimizer (ours) method to generate motion for "shooting a basketball"
python --conf confs/motion_ablation/motion_optimizer/shoot_basketball.conf
# Results are stored in `AvatarCLIP/AvatarAnimate/exp/motion_ablation/motion_optimizer/shoot_basketball` directory

Make Your Own Configuration

Each configuration contains three independent parts: general setting, pose generator, and motion generator.

# General Setting
general {
    # path where the results are stored
    base_exp_dir = ./exp/motion_ablation/motion_optimizer/raise_arms

    # if you only want to generate poses, set "mode = pose"
    mode = motion

    # define your prompt. We highly recommend using the format "a rendered 3d man is xxx"
    text = a rendered 3d man is raising both arms
}

# Pose Generator
pose_generator {
    type = VPoserCodebook
    # you can change the number of candidate poses by setting "topk = 10"
    # for PoseOptimizer and VPoserOptimizer, you can further define the number of iterations and the optimizer type
}

# Motion Generator
# if "mode = pose", you can ignore this part
motion_generator {
    type = MotionOptimizer
    # you can further modify the coefficient of each loss.
    # for example, if the generated motion is too intense, you can reduce the coefficient of delta loss.
}
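A configuration in this layout can also be generated from a script, which is convenient when sweeping over prompts. The helper below is a hypothetical sketch, not part of the repository; it emits only the keys shown above (base_exp_dir, mode, text, type, and optionally topk) and leaves out the motion generator block when mode is pose.

```python
def render_conf(base_exp_dir, text, pose_type="VPoserCodebook",
                motion_type="MotionOptimizer", mode="motion", topk=None):
    """Build a config string with general, pose_generator and motion_generator blocks."""
    lines = [
        "general {",
        f"    base_exp_dir = {base_exp_dir}",
        f"    mode = {mode}",
        f"    text = {text}",
        "}",
        "",
        "pose_generator {",
        f"    type = {pose_type}",
    ]
    if topk is not None:
        lines.append(f"    topk = {topk}")
    lines += ["}", ""]
    if mode == "motion":
        # The source notes this block can be ignored when mode = pose.
        lines += ["motion_generator {", f"    type = {motion_type}", "}", ""]
    return "\n".join(lines)


print(render_conf("./exp/custom/wave", "a rendered 3d man is waving", topk=10))
```

Write the returned string to a .conf file and pass that file via --conf as in the examples above.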


Distributed under the MIT License. See LICENSE for more information.

Related Works

There are lots of wonderful works that inspired our work or came around the same time as ours.

Dream Fields enables zero-shot text-driven general 3D object generation using CLIP and NeRF.

Text2Mesh proposes to edit a template mesh by predicting offsets and colors per vertex using CLIP and differentiable rendering.

CLIP-NeRF can manipulate 3D objects represented by NeRF with natural language or exemplar images by leveraging CLIP.

Text to Mesh facilitates zero-shot text-driven general mesh generation by deforming from a sphere mesh guided by CLIP.

MotionCLIP establishes a projection from the CLIP text space to the motion space through supervised training, which leads to amazing text-driven motion generation results.


This study is supported by NTU NAP, MOE AcRF Tier 2 (T2EP20221-0033), and under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s).

We thank the following repositories for their contributions to our implementation: NeuS, smplx, vposer, Smplx2FBX.

Download Details:

Author: hongfz16
Source Code:

License: MIT license

