Latent Semantic Indexing (LSI): What it is, how it works and what it means
LSI is a methodology for automatic document classification. It examines all the words in all the
documents of a corpus and calculates similarity measurements for each document or for
individual terms. It can gauge very accurately which documents in a corpus are really relevant to
a search phrase even if that search phrase does not appear in a document. Measuring relevancy
is a key component of a search engine’s ranking algorithm. When search engines use it, LSI can
have a significant impact on the ranking of your web pages.
What it is and how it works:
How can a search engine tell the difference between relevant information and irrelevant
information? Some search engines use LSI to achieve this goal. LSI helps improve a search
engine’s performance in three significant tasks: recall, precision, and ranking. Recall is getting all
of the relevant information available for your search. Precision is getting only the information that
is relevant to your search. Ranking is ordering the results in a meaningful way, for example from
the most relevant to the least.
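To make the three tasks concrete, here is a minimal sketch in Python. The document ids, the relevance judgments, and the scores are all hypothetical values chosen for illustration; a real search engine would derive them from its own index.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a collection of retrieved document ids."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    # Precision: what share of the returned documents is relevant?
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    # Recall: what share of the relevant documents was returned?
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def rank(scores):
    """Order document ids from the highest relevance score to the lowest."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: four documents returned, three documents truly relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
precision, recall = precision_recall(retrieved, relevant)

# Hypothetical relevance scores for the returned documents.
scores = {"d1": 0.91, "d2": 0.12, "d3": 0.67, "d4": 0.05}
ordering = rank(scores)
```

In this toy case two of the four returned documents are relevant, and two of the three relevant documents were found, so improving one measure without hurting the other is the real challenge.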
When you query a search engine that uses LSI, the search engine examines similarity values
calculated for every content word. This method examines the document collection as a whole and
knows which documents are semantically close or distant based on the relationships between all
the words in each document and all the words in the rest of the collection. LSI does not require an
exact match to the query phrase to find relevant documents.
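The idea of matching without an exact keyword can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular search engine does it: LSI is usually described as a truncated singular value decomposition (SVD) of a term-document matrix, and here a simple power iteration stands in for a proper SVD routine so that the example needs no external libraries. The four tiny documents, the function names, and the choice of two latent dimensions are all hypothetical, and raw word counts are used where a real system would apply term weighting.

```python
import math
import random


def tokenize(text):
    return text.lower().split()


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_matrix(docs):
    """Return (vocab, A) where A[t][d] counts term t in document d."""
    vocab = sorted({w for doc in docs for w in tokenize(doc)})
    index = {w: i for i, w in enumerate(vocab)}
    A = [[0.0] * len(docs) for _ in vocab]
    for d, doc in enumerate(docs):
        for w in tokenize(doc):
            A[index[w]][d] += 1.0
    return vocab, A


def top_eigenpairs(G, k, iters=200, seed=0):
    """Top-k eigenpairs of a symmetric matrix by power iteration with deflation."""
    n = len(G)
    G = [row[:] for row in G]  # work on a copy, since deflation modifies it
    rng = random.Random(seed)
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        for _ in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * G[i][j] * v[j] for i in range(n) for j in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found so the next one can emerge.
        for i in range(n):
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return pairs


def lsi_scores(docs, query, k=2):
    """Cosine similarity between the query and each document in a k-dim latent space."""
    vocab, A = build_matrix(docs)
    n_terms, n_docs = len(vocab), len(docs)
    # Gram matrix G = A^T A; its eigenvectors are the right singular vectors of A.
    G = [[sum(A[t][i] * A[t][j] for t in range(n_terms)) for j in range(n_docs)]
         for i in range(n_docs)]
    pairs = top_eigenpairs(G, k)
    q = [float(tokenize(query).count(w)) for w in vocab]
    qA = [sum(q[t] * A[t][d] for t in range(n_terms)) for d in range(n_docs)]
    # Fold the query in: q_hat = q^T U_k S_k^-1 = q^T A V_k S_k^-2 (S^2 = eigenvalue).
    q_hat = [sum(qA[d] * v[d] for d in range(n_docs)) / lam for lam, v in pairs]
    # Each document is represented by its row of V_k.
    return [cosine(q_hat, [v[d] for _, v in pairs]) for d in range(n_docs)]


# Hypothetical mini-corpus: two documents about vehicles, two about food.
docs = [
    "car engine repair",
    "automobile engine repair",
    "apple banana pie",
    "apple banana fruit",
]
scores = lsi_scores(docs, "car")
```

The second document never mentions "car", yet in this toy corpus it scores as high as the first because it shares the surrounding words "engine" and "repair", while the two food documents score near zero. That is the sense in which the context around a keyphrase matters.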
What this means for the search engine optimization (SEO) specialist and anyone with a website
who wants high visibility in the search engines is that every word on your web page is important,
not just the keyphrase(s). It is the right combination of all the words in your content that really
matters here. What you do with your keyphrase(s) is still important but now you must go beyond
that . . . way beyond that. You’ve got to have the right context to support your keyphrase(s).
Because LSI correlates surprisingly well with how we as humans might classify a document
collection, writing content that performs well under LSI analysis is not like writing contrived,
robotic-style verbiage for a machine. It involves giving proper attention both to persuasive,
well-written copy and to semantics. It is a delicate balance of art and science.
It forces you to write more relevant, more compelling content. This is good for the search engines
because it increases the quality of the content in their databases. This is also good for your
business because you’ll have content that generates more traffic and more conversions. A proper
solution will involve principles from multiple disciplines (computer science, information theory, and
human psychology) that dovetail very nicely with time-tested marketing principles.
Insight from the past:
The combined insights from three well-known figures in history, a theologian, a painter, and a
mathematician, can explain why.
• What does a 5th century Theologian (St. Augustine of Hippo) know about lasting joy
and human desire that can help you transform your web site into one that converts more
visitors to customers?!
• What does a 19th century Painter (Claude Monet) know about beauty and imagination
that will help you make your web site irresistible to the search engines?!
Fortune Interactive: The Future of Search Marketing
3200 Atlantic Avenue, Suite 100, Raleigh, NC 27604, 1-888-SEMLOGIC
• What does a 20th century Mathematician (Benoit Mandelbrot) know about the link
between words and chaos theory that will help you both persuade your visitor and satisfy
the search engines in one stroke?!
If this trio, Augustine, Monet, and Mandelbrot, sat down to dinner and had a discussion about how
search engines do what they do, about search engine marketing, and about your website, here is
what you might hear in that conversation. You might hear Augustine talk about why when you
write with LSI in mind, you will also have content that is compelling to your human audience. You
might hear Mandelbrot talk about why the math and artificial intelligence behind LSI track so well
with how humans use words and would organize documents. You might hear Monet talk about
the importance of creating a great context to support and enhance your keyphrase(s).
Augustine, Mandelbrot and Monet
Augustine's Law
It is our nature to be attracted to that which is beyond the ability of our minds to fully comprehend.
This is what Roy H. Williams calls Augustine's Law. In On the Trinity, Augustine comments that
the seeker in Psalm 105 finds the joy described therein only "... when one has been able to find
how incomprehensible that is which he was seeking..." Examples of some phenomena in nature
toward which we are drawn are ocean waves, cloud formations, mountains, lightning, and
snowflakes. All these examples share something in common. The elegant order in each can be
described in mathematics by the science of chaos. Unpredictability, information theory (the
foundation for LSI), and chaos are very closely related. Augustine's Law would hold that writing
copy with these principles in mind will add to the appeal of your message.
Mandelbrot's Fractals
Computers use mathematical equations to produce images called fractals which are maps of
chaotic systems (e.g. population fluctuations, chemical reactions, and clouds). Mandelbrot
actually created fractal images mapping the variations in stock market prices and, more important
to our topic, the probabilities of word occurrences in English. Suffice it to say, the way we use
language can be described by mathematical equations similar to those that describe other chaotic
systems. This is why something seemingly as mathematical and abstract as the principles and
concepts underlying LSI tracks so well with how humans use words and organize documents.
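One such equation is the Zipf-Mandelbrot law, Mandelbrot's generalization of Zipf's law, under which the probability of the word at frequency rank r is proportional to 1 / (r + b)^a. The sketch below is only an illustration: the function name is hypothetical, and the parameter values a = 1.0 and b = 2.7 are assumptions chosen for the example, not values fitted to any English corpus.

```python
def zipf_mandelbrot(n_words, a=1.0, b=2.7):
    """Probabilities for ranks 1..n_words under f(r) proportional to 1 / (r + b) ** a.

    The defaults for a and b are illustrative assumptions, not fitted values.
    """
    weights = [1.0 / (rank + b) ** a for rank in range(1, n_words + 1)]
    total = sum(weights)
    # Normalize so that the probabilities over the whole vocabulary sum to 1.
    return [w / total for w in weights]


# A hypothetical vocabulary of 1000 words, ordered from most to least frequent.
probs = zipf_mandelbrot(1000)
```

Under this model a handful of top-ranked words accounts for a large share of all occurrences, while most words are rare, which is the kind of regularity the paragraph above refers to.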
Monet's Impressionism
The term Impressionism was derived from a painting by Claude Monet, Impression: Sunrise
(1872). In this style, Monet would capture the ever-changing effects of sunlight on his
surroundings and the technique allowed him to be responsive both to the character and texture of
an object in nature and to the impact of light on its surfaces. He was able to engage the
imagination because he realized the importance of context in his painting technique. The color of
an object is modified by the light in which it is seen, by reflections from other objects, and by its contrast with juxtaposed colors.
Similarly, the color (sense of meaning) of a word is modified by the context in which it is seen,
by reflections from words near it, and by contrast with words juxtaposed to it. If you write with words
the way Monet painted with colors, you will engage the imagination of your audience as well. Roy
H. Williams speaks of this with regard to traditional marketing; it is even more important with
regard to search engines using LSI. Since LSI can help tell you what that context should be for a
word or phrase, Monet would highly recommend it as a great tool to support and enhance
keyphrase(s) in search engine marketing. Doing so would please the search engines and
captivate your audience.
What this means:
Latent Semantic Indexing (LSI) is a highly beneficial technique for search engines to use for
improving recall, precision, and ranking. In a future article, I will discuss in more detail how LSI is actually used within search engine algorithms. As an added benefit, by using LSI, search engines
provide an incentive for web copywriters and SEO professionals alike to produce better content in
their web pages. This, in turn, increases the quality of a search engine’s database.
In your business, is visibility in the search engines and your online presence important to you?
Then you need to understand the impact LSI has on your search marketing goals. You need to
experience the benefits a proper understanding of LSI can deliver when it becomes an integral
part of your search marketing efforts. At Fortune Interactive, we have technology, staff, and
expertise unique in the industry to help your search marketing efforts and your business reap
those benefits.
To learn more about this author, visit Michael Marshall's Website.
Michael Marshall
I have over 19 years of experience in information technology covering a wide range of specialties including: web design, software engineering, e-commerce solutions, artificial intelligence, and Internet marketing. I have degrees in Linguistics, Philosophy and Theology. I am also a contributing author to SEOToday.com, the premier website for SEM professionals, and a contributor to "Building Your Business With Google for Dummies" by Brad Hill. I am a frequent presenter at Ultra Advanced SEO Symposiums, a meeting of select masters of the search engine marketing industry, at Search Engine Workshops. I am also a certified instructor at the Raleigh-Durham-Chapel Hill Search Engine Academy, an SEO certification program approved by the US educational system.
Three SEO tools I've created: SEM SCOUT (Content Relevance), SEO Sniper (Keyphrase Research), and SEO Recon (SEO Competitive Intelligence).