Early search engines used an inverted index, which I covered in my tf-idf post. That approach is vulnerable to term spam: people would put invisible text on their pages (same color as the background) to trick web crawlers into thinking a page was about a topic it wasn't. The PageRank algorithm solves that problem by giving more importance to pages that have many in-links, and especially important in-links. Even if someone creates millions of fake pages that link to their real page, its importance will not rise much, because the fake pages themselves have low importance since no one links to them.
The web can be thought of as a directed graph where web pages are the vertexes and links are the arcs. The basic idea of PageRank is that you simulate web surfers starting at random spots in the web graph and then following random out-links. The most important pages, the pages with the most in-links, will end up with the most surfers.
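Here is a minimal sketch of that simulation in Python. The four-page graph, page names, and parameters are all made up for illustration: each surfer starts on a random page, follows random out-links for a while, and the fraction of surfers that end up on each page approximates its importance.

```python
import random
from collections import Counter

# Hypothetical toy web graph: page -> pages it links to.
links = {"A": ["B", "C"], "B": ["C", "D"], "C": ["A"], "D": ["A", "C"]}

def simulate_surfers(links, num_surfers=10_000, steps=50):
    """Drop surfers on random pages and let each follow random out-links."""
    pages = list(links)
    visits = Counter()
    for _ in range(num_surfers):
        page = random.choice(pages)
        for _ in range(steps):
            page = random.choice(links[page])
        visits[page] += 1
    # The fraction of surfers ending on a page approximates its importance.
    return {p: visits[p] / num_surfers for p in pages}

print(simulate_surfers(links))
```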
To implement this simulation, consider a transition matrix M with n rows and columns, where n is the number of pages crawled. M_ij has the value 1/k if page j has k out-links and one of them is to page i; otherwise it is 0. Start with a vector v0 whose elements are all 1/n. The first step is to multiply v0 by M, the second step is to multiply the result by M again (giving M²v0), and so on. This is an example of a Markov process: predictions about its future state depend only on the present state, not on the history.
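As a sketch of that setup, here is how M and the first multiplication might look with NumPy, using the same made-up graph as above:

```python
import numpy as np

# Same hypothetical graph: column j of M spreads page j's weight over its out-links.
links = {"A": ["B", "C"], "B": ["C", "D"], "C": ["A"], "D": ["A", "C"]}
pages = sorted(links)
idx = {p: i for i, p in enumerate(pages)}
n = len(pages)

M = np.zeros((n, n))
for j, page in enumerate(pages):
    out_links = links[page]
    for dest in out_links:
        M[idx[dest], j] = 1 / len(out_links)  # M[i, j] = 1/k if j has k out-links, one to i

v0 = np.full(n, 1 / n)  # surfers start at a uniformly random page
v1 = M @ v0             # distribution after one step
print(v1)
```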
If the web graph is one strongly connected component and doesn't have any dead ends, then the v satisfying v = Mv is the limiting distribution. The limit is reached when multiplying the distribution by M one more time doesn't change it. Vector v is an eigenvector of M because it satisfies Mv = λv (with λ = 1). Vector v is also the principal eigenvector because M is a stochastic matrix, and the eigenvalue associated with a stochastic matrix's principal eigenvector is 1. In practice it takes 50-75 iterations to reach this limit.
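Continuing the sketch above, the iteration itself is just repeated multiplication until the vector stops changing (the tolerance and iteration cap here are my own choices):

```python
# Power iteration: multiply by M until the distribution stops changing.
v = np.full(n, 1 / n)
for step in range(75):
    v_next = M @ v
    if np.abs(v_next - v).sum() < 1e-10:  # distribution no longer changes
        break
    v = v_next

# v is (approximately) the principal eigenvector of M: M @ v is approximately v.
print(step, v)
```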
M is too big for us to use Gaussian elimination. M is also very sparse, so it makes sense to store only the non-zero elements. An entire column can be represented by the out-degree of a page and the row numbers of its non-zero elements (since those elements' values are all 1 divided by the out-degree). That means we only need to store a little over 4 bytes for each non-zero element. Then we can solve for v using MapReduce, and v tells us the "page rank" of each page.
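A single-machine sketch of that sparse, column-oriented representation, reusing the toy graph and index mapping from above. The real computation would be distributed with MapReduce, but each piece would do the same column-by-column work shown here:

```python
# For each column (page) store only its out-degree and the rows of its
# non-zero entries; every such entry equals 1 / out-degree.
sparse_M = {j: (len(links[p]), [idx[d] for d in links[p]])
            for j, p in enumerate(pages)}

def sparse_multiply(sparse_M, v):
    """Compute M @ v touching only the non-zero entries."""
    result = [0.0] * len(v)
    for j, (out_degree, rows) in sparse_M.items():
        share = v[j] / out_degree
        for i in rows:
            result[i] += share
    return result

print(sparse_multiply(sparse_M, [1 / n] * n))
```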
In reality the web is not strongly connected. There are dead ends (pages with no out-links) and spider traps (groups of pages whose out-links never leave the group). This problem is solved by the concept of taxation: each surfer has a small probability of teleporting to a random page instead of following an out-link (see the sketch below). There are also other algorithm variations for dealing with link spam. For example, besides the page content, you also consider the link text or the words near a link, so you get other people's take on what a page is about instead of relying solely on the page owner.
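A sketch of the taxed iteration, reusing M and n from the earlier code: with probability beta the surfer follows an out-link, otherwise it teleports to a random page. The value beta = 0.85 is a commonly used choice, my assumption here rather than something from the post.

```python
beta = 0.85  # probability of following an out-link instead of teleporting (assumed value)
v = np.full(n, 1 / n)
for _ in range(75):
    # Taxation: the (1 - beta)/n term spreads the teleport probability over all pages,
    # which keeps dead ends and spider traps from soaking up all the weight.
    v = beta * (M @ v) + (1 - beta) / n
print(v)
```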
UPDATE:
This same idea can be applied to sports ratings where links are replaced by something like goals scored. My Machine Learning for NCAA Basketball Prediction - Performance Edition post has more details and some code.