[from the editor] A low rank model approximates a table as the (matrix) product of two numerical matrices X and Y.
Using low rank models to understand big data
In many application areas, researchers seek to understand large collections of tabular data, for example, patient lab test results. The values in the table might be numerical (3.14), Boolean (yes, no), ordinal (never, sometimes, always), or categorical (A, B, O). As a practical matter, some entries in the table might also be missing.
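To make the idea of approximating a table by a product of two matrices concrete, here is a minimal sketch. It assumes a small, fully numerical table with no missing entries and uses a truncated singular value decomposition from NumPy. That is one standard way to obtain such factors, not necessarily the method the article describes. The function name `low_rank_factors` and the synthetic data are illustrative only.

```python
import numpy as np


def low_rank_factors(table, rank):
    """Return X (m x rank) and Y (rank x n) such that X @ Y approximates table.

    Uses a truncated SVD, which gives the best rank-`rank` approximation
    in the least-squares (Frobenius norm) sense for a fully observed
    numerical table.
    """
    U, s, Vt = np.linalg.svd(table, full_matrices=False)
    X = U[:, :rank] * s[:rank]   # scale each kept column by its singular value
    Y = Vt[:rank, :]
    return X, Y


# Synthetic 6 x 4 table that has rank 2 by construction (illustrative data).
rng = np.random.default_rng(0)
table = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 4))

X, Y = low_rank_factors(table, rank=2)
max_error = float(np.max(np.abs(table - X @ Y)))
print(X.shape, Y.shape, max_error)
```

Because the synthetic table has rank 2 by construction, two columns in X and two rows in Y reproduce it up to rounding error. Boolean, ordinal, or categorical columns and missing entries, which the text mentions, would need a different loss function and fitting procedure than this sketch uses.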
And how mutual information is useful in Big Data settings
A deluge of data is transforming science and industry. Many hope that this massive flux of information will reveal new vistas of insight and understanding, but extracting knowledge from Big Data requires appropriate statistical tools. Often, very little can be assumed about the types of patterns lurking in large data sets. In these cases it is important to use statistical methods that do not make strong assumptions about the relationships one hopes to identify and measure.
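As a small illustration of a measure that does not assume a particular form of relationship, the sketch below estimates mutual information between two discrete variables from their empirical frequencies. The function name `mutual_information` and the toy data are assumptions made for this example. The simple plug-in estimate shown here is only one of several estimators and can be biased on small samples.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) for paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy / (p_x * p_y), written in terms of raw counts
        mi += p_xy * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data: y depends on x through a non-monotonic relationship (y = x squared).
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]

# Sample covariance, for comparison with a linear measure of association.
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)

mi = mutual_information(xs, ys)
print(covariance, mi)
```

In this toy case the covariance is zero, so a linear correlation coefficient would report no association, while the mutual information estimate is clearly positive because y is fully determined by x.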
How to tap into this cost-effective and flexible solution
Biomedical researchers who work with large data sets may run out of disk space, or of patience while waiting for a computation to finish. Though buying more hard drives and faster computers may seem tempting, the cloud is now a realistic option.
In 2008, when cloud computing was relatively new, this magazine published a column by Alain Laederach predicting that scientists would be won over to cloud computing, despite some people’s concerns about a loss in performance with the added layer of virtualization.
Cells have a limited repertoire of behaviors and interactions. They grow, divide, die, stick to each other, send and receive signals, change shape, polarize, differentiate (change behaviors), form sheets, secrete, absorb, pull on and remodel extracellular material, and migrate in response to signals in their environment.
How can they help us understand proteins?
Graphs, or networks, have been widely adopted in computational biology, with examples including protein-protein interaction networks, gene regulatory networks, and residue interaction networks in proteins, to name a few.
Set objectives and follow through
Having engineered several scientific software applications for public consumption, the authors know from experience that the process offers unique challenges. Typically, the algorithms being implemented are complex; the process involves numerous developers with various backgrounds and skill sets; and it all takes place in a fast-paced environment where new methods must be prototyped and tested regularly.
Advances in computational power and algorithms have led to longer and more accurate molecular dynamics simulations of protein folding.