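import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// The class name FetchPage is a placeholder; the original snippet shows only main().
public class FetchPage {
    public static void main(String[] args) {
        String currentLine;
        // StringBuilder avoids re-copying the whole text on every appended line.
        StringBuilder totalString = new StringBuilder();
        try {
            URL url = new URL("http://www.163.com");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.connect();
            // try-with-resources closes the reader and the underlying stream.
            try (InputStream urlStream = connection.getInputStream();
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(urlStream, "gbk"))) {
                // Decode the response as GBK and join its lines with a space.
                while ((currentLine = reader.readLine()) != null) {
                    totalString.append(currentLine).append(' ');
                }
            }
            String content = totalString.toString();
            System.out.println(content);
        } catch (Exception e) {
            // Network and decoding failures end up here; print them for debugging.
            e.printStackTrace();
        }
    }
}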
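This post shows an example of fetching the content of a web page from a given URL in Java. It creates a URL object, opens a connection with HttpURLConnection, reads the GBK-encoded page content through a BufferedReader, and prints it. The example is aimed at beginners who want to see the basics of writing a web crawler in Java.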
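The steps described above can also be split into two small methods, which makes the reading logic usable without a network connection. The following is only a sketch under a few assumptions: the names PageFetcher, readAll, and fetch are made up for illustration and do not come from the original code, and the 5-second timeouts are arbitrary values. Like the original loop, it joins the lines with a single space.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;

// Hypothetical helper class; all names here are illustrative.
class PageFetcher {

    // Reads every line from the stream, decoding it with the given charset,
    // and joins the lines with a single space (as the original loop does).
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append(' ');
            }
        }
        return sb.toString();
    }

    // Opens an HTTP connection and returns the decoded page text.
    // The timeout values are assumptions, not taken from the original post.
    static String fetch(String address, String charsetName) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(address).openConnection();
        connection.setConnectTimeout(5000); // milliseconds
        connection.setReadTimeout(5000);    // milliseconds
        try {
            return readAll(connection.getInputStream(),
                    Charset.forName(charsetName));
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Same URL and encoding as the example above.
        System.out.println(fetch("http://www.163.com", "GBK"));
    }
}
```

Keeping readAll separate from fetch means the decoding step can be tried on any InputStream, for example a ByteArrayInputStream holding GBK bytes, before pointing the code at a real site.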