Powered by CV Center, Tencent AI Lab, and ARC Lab, Tencent PCG.
This repository provides the official implementation of SEED and SEED-LLaMA. For any inquiries, please email [email protected].
🍻 We are actively looking for self-motivated interns. Please feel free to reach out if you are interested. 🍻
- 👀 We will release the checkpoints and code of the SEED-2 tokenizer and SEED-LLaMA-8B/14B, expected in late October.
- 👀 We will soon release an online demo for SEED-LLaMA.
- 2023-10-02 📎 We release the technical report of SEED-LLaMA on arXiv, which is empowered by the improved SEED-2 tokenizer.
- 2023-07-29 We release the checkpoint of the SEED tokenizer and its inference code. [Getting started]
- 2023-07-16 📎 We release the technical report of SEED on arXiv.
Stay tuned for updates!
- Python >= 3.8 (Anaconda is recommended)
- PyTorch >= 1.11.0
- NVIDIA GPU + CUDA
- Clone the repo

  ```bash
  git clone https://github.com/AILab-CVC/SEED.git
  cd SEED
  ```
- Install dependent packages

  ```bash
  sh install.sh
  ```
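After installation, a quick sanity check (a minimal sketch, not a script shipped with this repo) can confirm that the requirements above are met, i.e. that PyTorch and a CUDA GPU are visible:

```python
# Environment sanity check for the requirements above (PyTorch >= 1.11.0, CUDA GPU).
# Illustrative snippet; not part of the SEED repository.
import torch

print(f"PyTorch version: {torch.__version__}")
assert torch.cuda.is_available(), "No CUDA-capable GPU detected"
print(f"CUDA device: {torch.cuda.get_device_name(0)}")
```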
We release the pre-trained SEED Visual Tokenizer on Google Drive.
To discretize an image into 1D vision codes with causal dependency, and to reconstruct the image from those codes with the Stable Diffusion UNet:

- Download the pre-trained SEED Visual Tokenizer and the Stable Diffusion model from Google Drive and put them under the folder "pretrained".
- Run the inference code:

  ```bash
  python demo_recon.py
  ```
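For intuition, the sketch below shows the core operation behind such 1D vision codes: quantizing continuous visual features to discrete codebook indices by nearest-neighbor lookup. This is an illustrative stand-in, not SEED's actual tokenizer code; the codebook, shapes, and variable names are toy assumptions, and `demo_recon.py` remains the real entry point.

```python
# Toy illustration of turning visual features into a 1D sequence of discrete codes.
# Codebook, features, and shapes are random stand-ins, not SEED's real configuration.
import torch

num_codes, dim, seq_len = 8192, 768, 32     # assumed toy sizes for illustration
codebook = torch.randn(num_codes, dim)      # stand-in for a learned codebook
features = torch.randn(seq_len, dim)        # stand-in for causal 1D visual features

# Nearest-neighbor lookup: each feature vector maps to one discrete code index.
distances = torch.cdist(features, codebook)   # (seq_len, num_codes) pairwise L2
codes = distances.argmin(dim=-1)              # (seq_len,) integer vision codes
print(codes.tolist())                         # LLM-ready discrete tokens

# Reconstruction starts from the quantized embeddings looked up from the codebook;
# in SEED, such embeddings condition a Stable Diffusion UNet to regenerate pixels.
quantized = codebook[codes]                   # (seq_len, dim)
```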
If you find the work helpful, please consider citing:
```bibtex
@article{ge2023making,
  title={Making LLaMA SEE and Draw with SEED Tokenizer},
  author={Ge, Yuying and Zhao, Sijie and Zeng, Ziyun and Ge, Yixiao and Li, Chen and Wang, Xintao and Shan, Ying},
  journal={arXiv preprint arXiv:2310.01218},
  year={2023}
}

@article{ge2023planting,
  title={Planting a {SEED} of Vision in Large Language Model},
  author={Ge, Yuying and Ge, Yixiao and Zeng, Ziyun and Wang, Xintao and Shan, Ying},
  journal={arXiv preprint arXiv:2307.08041},
  year={2023}
}
```
The project is still in progress. Stay tuned for more updates!
SEED is released under Apache License Version 2.0.
We utilize Stable Diffusion to decode images from our visual codes, using the implementation and pre-trained model from https://github.com/CompVis/stable-diffusion. Our code is developed based on https://github.com/salesforce/LAVIS. Thanks for their wonderful work.