For the first time, almost all known proteins, some 200 million of them, have been collected in a single database

In 2021, Alphabet's DeepMind released an open-source database containing the 3D structures of hundreds of thousands of proteins, including 20,000 known proteins of the human body.

The database has now been expanded to more than 200 million structures and covers almost all known proteins.

Even today, it is difficult for scientists to work out the exact structure of a protein from the amino acids it is made of. Doing so usually requires huge amounts of computing power and time. This is known as the protein folding problem, and progress on it has been relatively slow.

Alphabet's DeepMind has now trained a powerful AI system on 100,000 known protein structures. According to its developers, the system can predict the structures of millions of other proteins, each in minutes or seconds rather than months or years.

DeepMind has now released a major update of the database, which includes some 214 million structures from about a million species, almost all of the proteins currently known to science. The database is expected to support research into disease treatment and vaccine development, and to help tackle antibiotic resistance.

The entire database, 25 terabytes in size, can be downloaded from Google Cloud.
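To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that the 25 terabytes cover all 214 million structures, that a terabyte is counted as 10^12 bytes, and that the download runs over a steady 1 Gbit/s link. None of these details are stated by DeepMind, and the function names are purely illustrative.

```python
# Back-of-the-envelope estimate. The totals come from the article;
# the decimal terabyte and the link speed are assumed example values.
TOTAL_BYTES = 25 * 10**12   # 25 terabytes, assuming 1 TB = 10^12 bytes
STRUCTURES = 214_000_000    # about 214 million structures


def average_structure_size_kb(total_bytes: int = TOTAL_BYTES,
                              structures: int = STRUCTURES) -> float:
    """Average size of one structure in kilobytes (1 KB = 1000 bytes)."""
    return total_bytes / structures / 1000


def download_days(total_bytes: int = TOTAL_BYTES,
                  megabits_per_second: float = 1000.0) -> float:
    """Days needed to download everything at a constant link speed."""
    bytes_per_second = megabits_per_second * 1_000_000 / 8
    return total_bytes / bytes_per_second / 86_400  # 86,400 seconds per day


if __name__ == "__main__":
    print(f"{average_structure_size_kb():.0f} KB per structure on average")
    print(f"{download_days():.1f} days at 1 Gbit/s")
```

Under these assumptions, each structure averages a little over 100 kilobytes, and the full download would take a bit more than two days of uninterrupted transfer.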