Next, get a list of random people to make up the dataset. Fortunately, Hot or Not provides an API call that returns a list of people with specified criteria. In this example,
the only criteria will be that the people have “meet me” profiles, since only from
these profiles can you get other information like location and interests. Add this
function to hotornot.py:
What Does This Have to Do with the Articles Matrix?
So far, what you have is a matrix of articles with word counts. The goal is to factorize
this matrix, which means finding two smaller matrices that can be multiplied
together to reconstruct this one. The two smaller matrices are:
The features matrix
This matrix has a row for each feature and a column for each word. The values
indicate how important a word is to a feature. Each feature should represent a
theme that emerged from a set of articles, so you might expect an article about a
new TV show to have a high weight for the word “television.”
The weights matrix
This matrix maps the features to the articles matrix. Each row is an article and
each column is a feature. The values state how much each feature applies to each
articl...
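The shapes described in the excerpt above can be checked with a small sketch. This is only an illustration, not the book's code: the word list, the numbers, and the `matmul` helper are all made up for the example, and the two small matrices are chosen by hand rather than found by a factorization algorithm. It shows that multiplying a weights matrix (articles by features) by a features matrix (features by words) yields a matrix with one row per article and one column per word.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

# Made-up vocabulary: one column per word.
words = ["television", "show", "election", "vote"]

# Features matrix: one row per feature, one column per word.
features = [
    [3, 2, 0, 0],  # feature 0: a "TV" theme
    [0, 0, 4, 1],  # feature 1: a "politics" theme
]

# Weights matrix: one row per article, one column per feature.
weights = [
    [1, 0],  # article 0 is purely about the TV theme
    [0, 2],  # article 1 is strongly about politics
    [1, 1],  # article 2 mixes both themes
]

# Reconstructed articles matrix: one row per article, one column per word.
articles = matmul(weights, features)

for row in articles:
    print(dict(zip(words, row)))
```

In a real factorization the articles matrix is the input and the two smaller matrices are what the algorithm has to find; here the direction is reversed purely to make the row and column roles concrete.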
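2 helpful Earthson 2012-12-16 17:09:06
A very "basic" book, or rather a very applied one. It works for getting a quick overview of the field; strictly speaking it may not even count as foundational, only as an introduction.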
0 helpful 巳炷销 2009-12-18 22:10:01
Point me to a clear path.
3 helpful 汪杨 2017-06-16 20:53:54
Very practical, though some of the content is dated.
0 helpful 睡睡睡 2011-11-23 11:50:45
Very practical.
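0 helpful 阅微草堂 2016-02-05 11:43:49
The bee swarm in Out of Control: a swarm is not a democracy but a model that is locally random yet globally optimal; Pearson distance is a measure of the whole.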