InnoCentive Founder's Blog - Mitigating the Swiss Cheese Factor in Molecular Libraries

Posted by abingham on Jan 26, 2011 1:43:40 PM

Today's post, contributed by InnoCentive Founder Alph Bingham, addresses the gaps in most molecular compound libraries, and the new ways InnoCentive is reaching out to Solvers to help close those gaps.

The very concept of molecular space is an intriguing one. But even after decades, it seems a rigorous (rigid?) standard has never developed for defining this space. Are the axes topological alone? Are they graph theoretical and limited to connectivity? Are they electronic? Do they include physical or chemical properties associated more with the nature of the substance than the behavior of a single molecule? Well, yes. All of the above, depending on application, assumptions and objectives.

But even without a rigorous set of definitions, the metaphor alone allows us to think carefully about the design of our experiments, the creation of our libraries (collections of molecules), the subsets we'll screen and the way we'll respond to hits and build out SARs (structure-activity relationships). In a perfect world, our libraries are "smooth," not "lumpy." They fill out molecular space evenly, not like Swiss cheese with big holes. And the density is a factor under OUR control. We use low-density compound collections when we must screen across vast territories, and we use high-density collections when prior screens and experience have told us which regions of space hold interesting possibilities.

But it's rarely a perfect world. Our libraries ARE lumpy, they ARE Swiss-cheesy, and the cheese bits are more like chunks of lead scattered among marshmallows. Why? Well, several reasons. One is that libraries tend to be built over time, and library owners' interests and molecular focus shift along the way. Add to that the fact that some compounds and compound classes are easy to make and some are very hard. The result is that a pragmatic filling of molecular space sometimes trumps an ideal one. Even carefully guided SARs in well-bounded regions of space fall prey to testing what can be readily made, as opposed to what would best allow the response (activity) to be mapped over the spatial region under study.

Of course, one reason our libraries fall short of ideal is that molecular space is just so darn huge! It has been estimated that there are 10^60 POSSIBLE organic compounds with a molecular weight under 500. Needless to say, they haven't all been made: even at a rate of one new molecule every second (way faster than the planet's current rate of production), it would take trillions and trillions of lifetimes of the universe to make them (a measly 5x10^17 molecules every 15 billion years). We aren't likely to get that space "filled" any time soon. If testing across that space matters to your company, it's not going to help to just hire a few more chemists next year. In fact, let's admit that the PERFECT solution is PERFECTLY unattainable.
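
For anyone who wants to check the back-of-the-envelope arithmetic, here is a minimal Python sketch. The inputs are the post's own round figures (10^60 possible compounds, one new molecule per second, a 15-billion-year universe); the variable names and the 365.25-day year are assumptions made purely for illustration.

```python
# Rough check of the "one molecule per second" estimate.
# All inputs are the round figures quoted in the post, not measured data.

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed average year length
UNIVERSE_AGE_YEARS = 15e9               # figure used in the post
POSSIBLE_COMPOUNDS = 1e60               # estimated organic compounds under 500 MW
MOLECULES_PER_SECOND = 1                # hypothetical synthesis rate

# Molecules made over one lifetime of the universe at that rate.
molecules_per_lifetime = UNIVERSE_AGE_YEARS * SECONDS_PER_YEAR * MOLECULES_PER_SECOND

# Number of universe lifetimes needed to make every possible compound.
lifetimes_needed = POSSIBLE_COMPOUNDS / molecules_per_lifetime

print(f"Molecules per universe lifetime: {molecules_per_lifetime:.1e}")
print(f"Universe lifetimes needed:       {lifetimes_needed:.1e}")
```

Under these assumptions the first figure comes out just under 5x10^17, in line with the number quoted above, and the second comes out on the order of 10^42 lifetimes.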

But there are rational approaches to improving libraries and getting better results from your screening efforts, whether you are looking for new drugs, new agchem products, fragrances or fabric softeners. First, acknowledge you are dealing with "Swiss cheese." Start by identifying the regions of space that are lacking (however you define the axes, you'll find holes). Second, widen your search. Your labs can't fill those holes alone, and they shouldn't try. But people have been making novel molecules for a long time and for various reasons. Each of those differentiating reasons has led to a unique collection of compounds. A few of those compounds are in commercial catalogs, but only a few, very few actually. Most of them are in small vials with hand-written labels and a covering of Scotch tape. They are buried on shelves, in cabinets, and in refrigerated storage areas all over the planet. And not even the Google trawlers can find them.

We are announcing a new search. InnoCentive has been able to effectively search creative minds around the planet for ideas, for inventions and sometimes for things they were about to invent (in response to well-articulated challenges). We'd like to turn that effort toward finding novel molecules scattered around labs, as well as novel molecules you'd like to produce in response to some specifically defined "calls to action."

We've just announced a new Novel Molecules Pavilion, where we will house Challenges posted by Seeker clients wishing to build out their molecular libraries in the most efficient and cost-effective manner possible. Take a look at the pavilion and think about the possibilities available for your own compound library.

Topics: Innovation Insights, Challenges
