Comment: Critical Point Physics World  November 2020

The bank of success

Community minded: Helen Berman, who helped pioneer the Protein Data Bank. (Courtesy: Photograph by Nick Romanenko, © Rutgers University)

After helping to set up the first major repository of protein structures 50 years ago, Helen Berman is now significantly expanding its scope, as Robert P Crease finds out

When Helen Berman was working as an X-ray crystallographer at the Institute for Cancer Research (ICR) in Philadelphia in 1969, she and a handful of colleagues realized that the field was about to be inundated. Until then, determining the structure of a protein – describing the positions of its individual atoms – was a time-consuming process, requiring complex diffraction data to be produced and interpreted by hand. Most proteins have tens of thousands of atoms, and back then barely a dozen or so protein structures had been determined.

But Berman, who was an ICR research associate, realized that improved instrumentation and new computational methods would soon send the number of protein structures skyrocketing. A proper database able to archive voluminous numbers of structures in a standard form was urgently needed, she felt. Accomplishing that goal would require extensive computing capabilities and state-of-the-art computer graphics.

And so it was that at a 1971 conference at the Cold Spring Harbor Laboratory on Long Island, New York, Berman and her colleagues broached the subject with Walter Hamilton, a renowned crystallographer from Brookhaven National Laboratory, inspiring him to create the Protein Data Bank (PDB) that year. Berman’s intuition was correct: entries in the PDB grew exponentially from seven in 1971 to 100 by 1982, 1000 in 1993 and 10,000 in 1999. Today the PDB has almost 170,000 entries, making it the most important open-access, digital-data resource in biology.

Off-the-scale problems

But in 1998, when Berman moved the PDB to Rutgers University, she sensed another imminent crisis. While initially X-ray crystallography was the main tool for determining protein structures, it was soon joined by nuclear magnetic resonance spectroscopy (NMR) and cryogenic electron microscopy (cryo-EM). Researchers therefore increasingly had to turn to “integrative” structure determination based on data from multiple experimental methods, not just cryo-EM and NMR but also chemical crosslinking, small-angle scattering, Förster resonance energy transfer and others.

Berman thought this pointlessly hampered research, since users had to navigate different data-management practices. Seeking a way to allow these differently determined protein models to be entered into a single repository, she immediately realized that each community had not only created different ways to store and manage data, but had also developed its own language, standards and criteria for acceptability, as well as different limits on the possible positions of individual atoms in a protein.

For a while, Berman overcame this obstacle by fitting integrative structures into the PDB case by case, but it was gruelling work. As she told me, she realized, “No way this is gonna scale!” Berman then tried to expand the PDB standards and give them to the different communities involved in integrative modelling, but it didn’t go well. Each community worked differently, making sense of data using procedures and standards appropriate to its own research; members feared that new standards imposed from outside would interfere with creativity and success.

Reactions ranged from polite lack of interest to outright hostility: the day after one presentation, for example, a researcher sent Berman an offensive message insisting he had no intention of doing what “Madame President” wanted. Her first thought was: “How can I get this guy on my side?” Figuring that out wasn’t easy. Or as Berman politely puts it: “Scientists are not usually skilled at social engineering.”

She therefore decided to leave the standards to the communities themselves, befriending members, tracking down the experts, and explaining how success would benefit their communities. Most importantly, she listened to why they felt certain things might be impossible in practice. “That way, the leaders of each community remained the leaders,” says Andrej Sali, a structural biologist at the University of California, San Francisco, who works with Berman. “They kept taking care of their corner of the universe.”

However, the communities needed to figure out ways to exchange data and communicate with each other. The result was PDB-Dev – a flexible test platform that can accommodate different types of integrative structures before they are eventually archived in the PDB. It succeeded in bringing structural biologists and the different experimental communities together, with two committees – one to archive models and exchange data, and the other to validate different models. PDB-Dev released its first structure in 2016.

Berman stepped down as the head of the PDB in 2014, but continues her involvement in it, including leading a workshop last year. “It’s a work in progress,” she admits. “As each structure comes in, we find a new set of problems that we hadn’t thought of before.” PDB-Dev now has 61 structures and aims to fold into the PDB over the next five years.

The critical point

Berman attributes some of her success to her own family background: her father was a surgeon and professor at a medical school, while her mother was a community health organizer in poor neighbourhoods of Brooklyn. “Her talent was to get people to talk together,” Berman said. “The next step is to show people that there’s a problem.” In much the same way, she feels, her skill lies in encouraging scientists to speak outside their technical vocabularies in language those in other communities can understand – including why some things could not work.

After one of Berman’s presentations, an audience member told her that what she was doing sounded like the work of the American political economist Elinor Ostrom (1933–2012). As someone who specialized in describing the principles of managing resources in diverse communities, Ostrom found that the wrong approach was to seek a “theory” for managing the combined practices of different groups; the key was to get them to work together first. Or, as Berman puts it: “Science is alive, a fluid thing. Making it go is community work.”