A code-breaking programme which could hold the key to unlocking some of nature's most hidden medical secrets has been road-tested on a Jane Austen novel.

It is hoped the algorithm developed by the University of Bradford's Professor Simon Shepherd will reveal the mysteries of our genes and pave the way towards new treatments and drugs for diseases sooner than expected.

And it is being tested on the novel Emma, published in 1816.

The classic tale of Emma Woodhouse and her attempts at matchmaking in polite Regency English society could help scientists further understand the complexities of DNA sequences and genes.

Mr Shepherd, a professor of computational mathematics, has been working on genomics with Professor Clive Beggs, a professor of medical technology, and Dr Sue Jones, a lecturer in biomedical science, in Bradford's Medical Biophysics Research Group.

To test the algorithm, Mr Shepherd ran it on the text of the novel, which has had numerous big-screen adaptations, including a Hollywood version with Gwyneth Paltrow as the eponymous heroine.

All spaces and punctuation were removed - converting the revered literary text into a string of unintelligible letters.

But when Mr Shepherd tested the algorithm on Austen, he found the programme succeeded in recognising patterns of letters, identifying 80 per cent of the words and separating them back into sentences.
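
The article does not describe how Professor Shepherd's algorithm works, so the following is only a rough illustration of the task itself, not his method. It strips a sentence down to bare letters, then tries to split the letters back into words using a word list supplied in advance, which is a far easier problem than discovering the words unaided. The function names `strip_text` and `segment` and the tiny vocabulary are made up for this sketch.

```python
import string
from typing import List, Optional, Set


def strip_text(text: str) -> str:
    """Lower-case the text and keep only the letters a-z."""
    return "".join(ch for ch in text.lower() if ch in string.ascii_lowercase)


def segment(letters: str, vocabulary: Set[str]) -> Optional[List[str]]:
    """Split `letters` into words from `vocabulary`, or return None if impossible.

    Assumption: the vocabulary is known beforehand. This is a plain
    dynamic-programming word split, not the Bradford algorithm.
    """
    max_len = max(map(len, vocabulary), default=0)
    # best[i] holds one valid split of letters[:i], or None if there is none.
    best: List[Optional[List[str]]] = [None] * (len(letters) + 1)
    best[0] = []
    for end in range(1, len(letters) + 1):
        # Try the longest candidate word first, then shorter ones.
        for start in range(max(0, end - max_len), end):
            word = letters[start:end]
            if best[start] is not None and word in vocabulary:
                best[end] = best[start] + [word]
                break
    return best[len(letters)]


# Hypothetical word list, just large enough for the example sentence.
vocabulary = {"emma", "woodhouse", "handsome", "clever", "and", "rich"}
letters = strip_text("Emma Woodhouse, handsome, clever, and rich")
words = segment(letters, vocabulary)
```

Here `letters` is the unbroken string `emmawoodhousehandsomecleverandrich`, and `words` comes back as the six original words in order. With no word list to lean on, as with DNA, the patterns have to be found in the letters themselves, which is the hard part the article is describing.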

He now believes that the programme can be applied to the genetic sequence in the human body, which contains around three billion letters and has so far baffled scientists trying to interpret it.

It is thought cellular processes might be regulated and controlled by uncracked codes which the algorithm might be able to decipher.

The research will be part of the massive Human Genome Project, which sees scientists all over the world trying to map the codes of the entire human genome, which includes billions of DNA sequences making up thousands of genes.

Mr Shepherd said: "We are treating DNA as we used to treat problems in intelligence. We want to break the code at the most fundamental level."

He said about two metres of DNA was fitted into the nucleus of a human cell.

The nucleus is just a few micrometres in diameter, meaning the DNA is packed together with proteins in a complex hierarchy of "folding back and wrapping around".

Mr Shepherd said: "The protein folding problem is regarded as one of the three grand challenges of 21st century science.

"Its resolution is crucial to the development of the new drugs and medical therapies that the Human Genome Project promises one day to deliver."

He said he hoped to get results from a combination of elements including numerate sciences such as mathematics and engineering and an expert knowledge of biochemistry at a cellular level.

"Although results will not happen overnight, we can expect to see the promise of the Human Genome Project bearing fruit within the next 20 to 50 years," he said.

e-mail: rebecca.wright@bradford.newsquest.co.uk

SO WHAT'S AN ALGORITHM?

  • In very simple terms an algorithm is a mathematical problem-solving procedure using a finite number of steps.
  • During the Second World War a machine was devised to carry out an algorithm for cracking Enigma - the Germans' communication code. The task was undertaken at Bletchley Park in Milton Keynes, which is now open to the public as a museum.
  • The concept of an algorithm originated as a means of recording procedures for solving mathematical problems such as finding the greatest common divisor of two numbers or multiplying two numbers.
  • Most algorithms can be directly implemented by computer programs. Any other algorithms can in theory be simulated by computer programs.
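
As an illustration of the points above, here is one of the oldest known algorithms, Euclid's method for finding the greatest common divisor of two numbers, written as a short computer program. It is offered only as a minimal sketch of what "a finite number of steps" looks like in practice. The function name `gcd` is chosen for this example.

```python
def gcd(a: int, b: int) -> int:
    """Return the greatest common divisor of a and b using Euclid's method."""
    a, b = abs(a), abs(b)
    # Each pass replaces the pair (a, b) with (b, a mod b). The second
    # number shrinks every time, so the loop must stop after finitely
    # many steps.
    while b:
        a, b = b, a % b
    return a


result = gcd(48, 18)  # 48, 18 -> 18, 12 -> 12, 6 -> 6, 0, so the answer is 6
```

The same procedure can be carried out by hand with pencil and paper. The program simply records the steps in a form a computer can follow.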