Bioinformatics

Posted by mady | Posted on 1:12 AM

1. What is Bioinformatics?

Bioinformatics is the application of computers to the biological
sciences, and especially to the analysis of biological sequence data.
It is concerned with capturing, storing, graphically displaying,
modeling and ultimately distributing biological information. It is
becoming an essential tool in molecular biology as genome projects
generate vast quantities of data. With a new sequence being added to
the DNA databases, on average, once every minute, there is a pressing
need to convert this information into biochemical and biophysical
knowledge by deciphering the structural, functional and evolutionary
clues encoded in the language of biological sequences.

What Bioinformatics therefore offers to the researcher, the
entrepreneur, or the venture capitalist is an enormous and exciting
array of opportunities to discover how living systems metabolise,
grow, combat disease, reproduce and regenerate. The current knowledge
represents only the tip of the iceberg. Exciting and startling
discoveries are being made every day through Bioinformatics, which is
building up an extensive encyclopedia from which life's mysteries will
be unraveled. The importance of computational science in collating
this information, and its simultaneous interpretation by biologists,
is the underlying ethos of Bioinformatics.

An interest in biology and a strong inclination towards genetics
certainly help. But from our point of view, the most important point
is that biocomputing requires a great many software professionals,
and there is more work for them to do than for the experts in biology.

2. Computers and Biology

Bioinformatics is the symbiotic relationship between computational and
biological sciences. The ability to sort and extract genetic sequences
from a human genomic database of 3 billion base pairs of DNA in a
meaningful way is perhaps the simplest form of Bioinformatics. Moving
on to another level, Bioinformatics is useful in mapping different
people's genomes and deriving differences in their genetic make-up
through pattern recognition software. But that is the easiest part.
What is more complex is to work out what those differences in genetic
make-up translate into in terms of physiological traits. And there is
yet another, even more intricate level: the genetic code itself. The
genetic code specifies amino acids and thereby proteins, and the
specific role played by each of these proteins controls the state of
our health. The role or function of each of our genes in coding for a
specific protein, which in turn regulates a particular metabolic
pathway, is the subject of "functional genomics". The true benefit of
Bioinformatics therefore lies in harnessing information pertaining to
these genetic functions in order to understand how human beings and
other living systems operate.
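
The two steps described above, finding where two people's sequences
differ and seeing what the difference means at the protein level, can
be sketched in a few lines of Python. This is only an illustrative
sketch: the two DNA strings are toy sequences made up for this example,
the function names are hypothetical, and the codon table is the
standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES),
        AMINO_ACIDS,
    )
}


def translate(dna):
    """Translate a coding DNA string into one-letter amino acids.

    Reads codon by codon from position 0 and stops at the first stop codon.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def differences(a, b):
    """List (position, letter_in_a, letter_in_b) where two equal-length
    sequences differ. Positions are 0-based."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Toy sequences (assumed for illustration): they differ by a single base.
reference = "ATGGTGCACCTGACTCCTGAGGAGAAG"
variant = "ATGGTGCACCTGACTCCTGTGGAGAAG"

print(differences(reference, variant))                        # [(19, 'A', 'T')]
print(translate(reference))                                   # MVHLTPEEK
print(translate(variant))                                     # MVHLTPVEK
print(differences(translate(reference), translate(variant)))  # [(6, 'E', 'V')]
```

Here a single changed base turns one amino acid into another. Spotting
the change is the easy part; what that change does to the protein's
role in the body is the harder question the paragraph above refers to.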

Computational simulation of experimental biology is an important
application of Bioinformatics, which is referred to as "in silico"
testing. This is perhaps an area that will expand rapidly, given the
need to obtain a greater degree of predictability in animal and human
clinical trials. In addition, "in silico" testing offers a way to
respond to the growing hostility towards animal testing. The growth of
this sector will largely depend on the acceptance of "in silico"
testing by the regulatory authorities. Irrespective of this, however,
research strategies will certainly find computational modeling to be a
vital tool in speeding up research, with enormous cost benefits.

3. Limitations in the use of computers

The last decade has witnessed the dawn of a new era of
'silicon-based' biology, opening the door, for the first time, to the
possible investigation and comparative analysis of complete genomes.
Genome analysis aims to elucidate and characterize the genes and gene
products of an organism. It depends on a number of pivotal concepts,
concerning the processes of evolution (divergence and convergence),
the mechanism of protein folding, and the manifestation of protein
function.

Today, our use of computers to model such processes is limited by,
and must be placed in the context of, the current limits of our
understanding of these central themes. At the outset, it is important
to recognize that we do not yet fully understand the rules of protein
folding; we cannot invariably say that a particular sequence or a fold
has arisen by divergent or convergent evolution; and we cannot
necessarily diagnose a protein function, given knowledge only of its
sequence or of its structure, in isolation. Accepting what we cannot
do with computers plays an essential role in forming an appreciation
of what, in fact, we can do. Without this kind of understanding, it is
easy to be misled, as spurious arguments are often used to promote
perhaps rather overenthusiastic points of view about what particular
programs and software packages can achieve.

Nature has its own complex rules, which we only poorly understand and
which we cannot easily encapsulate within computer programs. No
current algorithm can 'do' biology. Programs provide mathematical
models of biological systems, and a model that is mathematically exact
can still be biologically wrong. To interpret
correctly whether sequences or structures are meaningfully similar,
whether they have arisen by the processes of divergence or
convergence, whether similar sequences or similar folds have the same
or different functions: these are the most challenging problems. There
are no simple solutions, and computers do not give us the answers;
rather, given a sea of data, they help to narrow the options down so
that the users can begin to draw informed biologically reasonable
conclusions.

4. Current Stage of Research

In the field of Bioinformatics, the current research drive is to be
able to understand evolutionary relationships in terms of the
expression of protein function. Two computational approaches have been
brought to bear on the problem, tackling the identification of protein
function from the perspectives of sequence analysis and of structure
analysis respectively. From the point of view of sequence analysis, we
are concerned with the detection of relationships between newly
determined sequences and those of known function (usually within a
database). This may mean pinpointing functional sites shared by
disparate proteins (probably the result of convergent evolution), or
identifying related functions in similar proteins (most commonly the
result of divergent evolution).
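
As a rough illustration of comparing a newly determined sequence with
database entries of known function, the sketch below ranks entries by
the fraction of short subsequences (3-letter "k-mers") they share with
the query. Everything here is an assumption made for the example: the
protein strings are invented toy data, the names are hypothetical, and
shared k-mers are a crude stand-in for the alignment-based searches
that real tools perform.

```python
def kmers(seq, k=3):
    """Return the set of all overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of the two k-mer sets: shared / total distinct."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka and not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def rank_hits(query, database, k=3):
    """Score every database entry against the query, best match first."""
    scores = [(name, similarity(query, seq, k)) for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy "database" of annotated sequences (invented for illustration).
database = {
    "toy_protease": "MKTAYIAKQRQISFVKSHFSRQ",
    "toy_kinase": "MGLSDGEWQLVLNVWGKVEAD",
    "toy_transporter": "MSPILGYWKIKGLVQPTRLLLE",
}

# A "newly determined" sequence: one substitution away from toy_kinase.
query = "MGLSDGEWQLVLHVWGKVEAD"

for name, score in rank_hits(query, database):
    print(f"{name}: {score:.2f}")
```

The top hit only says which known sequence the query most resembles.
Whether the two actually share a function still has to be judged
against the underlying biology.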

The identification of protein function from sequence sounds
straightforward, and indeed, sequence analysis is usually a fruitful
technique. But function cannot be inferred from sequence for about
one-third of proteins in any of the sequenced genomes, largely because
biological characterization cannot keep pace with the volume of data
issuing from the genome projects (a large number of database sequences
thus either carry no annotation beyond the parent gene name, or are
simply designated as hypothetical proteins). Another important point
is that, in some instances, closely related sequences, which may be
assumed to share a common structure, may not share the same function.
What this means is that, although sequence or structure analysis can
be used for deducing gene functions, neither technique can be applied
infallibly without reference to the underlying biology.

5. Microbial, Plant and Animal Genomes

Although the human genome appears to be the focal point of interest,
microbial, plant and animal genomes are equally exciting to explore
through Bioinformatics. Mining plant genomes opens up new vistas for
research in agriculture. Microbial
genomics offers a dual opportunity of developing new
fermentation-based products and technologies as well as defining new
ways of combating microbial infections. Exploring animal genomics
opens up unlimited scope to pursue research in veterinary science and
transgenic models.

6. History (Stages of development)

The science of sequencing began slowly. The earliest techniques were
based on methods for separation of proteins and peptides, coupled with
methods for identification and quantification of amino acids. Prior
to 1945, there was not a single quantitative analysis available for
any one protein. However, significant progress with chromatographic
and labeling techniques over the next decade eventually led to the
elucidation of the first complete sequence, that of the peptide
hormone insulin (1955). Yet it was another five years before the
sequence of the first enzyme (ribonuclease) was complete (1960). By
1965, around 20 proteins with more than 100 residues had been
sequenced, and by 1980, the number was estimated to be of the order of
1500. Today, there are more than 300,000 sequences available.

Initially, the majority of protein sequences were obtained by the
manual process of sequential Edman degradation combined with
dansylation. A key step towards the rapid increase in the number of
sequenced proteins was the development of automated sequencers, which,
by 1980, offered a 10^4-fold increase in sensitivity relative to the
automated procedure implemented by Edman and Begg in 1967.

In the 1960s, scientists struggled to develop methods to sequence
nucleic acids, but the first techniques to emerge were really only
applicable to tRNAs, because these molecules are short (74 to 95
nucleotides in length) and can be purified individually.

In contrast to tRNA, DNA is very long: human chromosomal molecules may
contain between 55 × 10^6 and 250 × 10^6 base pairs. Assembling the
nucleotide sequence of a complete DNA molecule is a huge task. Even if
the sequence can be broken down into smaller fragments, purification
remains a problem. The advent of gene cloning provided a solution to
how the fragments can be separated. By 1977, two sequencing methods
had emerged using chain termination and chemical degradation
approaches. With only minor changes, the techniques propagated to
laboratories throughout the world, and laid the foundation for the
sequence revolution of the next two decades, and the subsequent birth
of Bioinformatics.

During the last decade, molecular biology has witnessed an
information revolution as a result both of the development of rapid
DNA sequencing techniques and of the corresponding progress in
computer-based technologies, which are allowing us to cope with this
information deluge in increasingly efficient ways. The broad term that
was coined in the mid-1980s to encompass computer applications in
biological sciences is Bioinformatics. The term Bioinformatics has
been commandeered by several different disciplines to mean rather
different things. In its broadest sense, the term can be considered to
mean information technology applied to the management and analysis of
biological sequence data; this has implications in diverse areas,
ranging from artificial intelligence and robotics to genome analysis.
In the context of genome initiatives, the term was originally applied
to the computational manipulation and analysis of biological sequence
data. However, in view of the recent rapid accumulation of available
protein structures, the term now tends also to be used to embrace the
manipulation and analysis of 3D structure data.
