Downloading Meizitu Images with Golang

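Want to download Meizitu images with Golang? Come take a look, it is very convenient.

With some free time on my hands, I wanted to see what I could build with Golang. Python scripts for downloading Meizitu images are everywhere, so I decided to write a simple little downloader in Golang instead.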

A quick look at the page structure

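Open https://www.meizitu.com and click through to a detail page: the URL becomes https://www.meizitu.com/a/5511.html . Changing the number in the URL to other values still loads a page, so for now we simply enter the page indexes by hand.
Next, look at the page structure, which is easy to inspect with Chrome's developer tools:
[Screenshot: the Elements panel]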

Downloading the images

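Now that we have a rough idea of the page structure, we can implement a simple image downloader. The packages used here are:
colly: crawls the pages and fetches the images
uuid: generates UUIDs
cli: command-line toolkit
1. First, initialize a queue to hold the URLs that need to be visited.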

// The error returned by queue.New is ignored here for brevity.
var q, _ = queue.New(
	2, // number of consumer threads
	&queue.InMemoryQueueStorage{MaxSize: 10000}, // in-memory storage, capped at 10000 queued URLs
)
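
To make the idea of "a queue with two consumers" more concrete, here is a minimal sketch using only the standard library. It is not how colly's queue is implemented; the function name process and the handle callback are made up for illustration. It also hints at why the Visiting lines in the output further down are not strictly in order: with more than one consumer, completion order is not guaranteed.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// process hands every URL to one of `workers` goroutines and collects the
// results. It is a stand-in for a crawl queue with N consumers, not
// colly's actual implementation.
func process(urls []string, workers int, handle func(string) string) []string {
	jobs := make(chan string)
	results := make(chan string)
	var wg sync.WaitGroup

	// Start the consumers.
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range jobs {
				results <- handle(u)
			}
		}()
	}

	// Feed the queue, then close it so the consumers exit.
	go func() {
		for _, u := range urls {
			jobs <- u
		}
		close(jobs)
	}()

	// Close the results channel once every consumer has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []string
	for r := range results {
		out = append(out, r)
	}
	// Completion order varies between runs, so sort for stable output.
	sort.Strings(out)
	return out
}

func main() {
	urls := []string{
		"https://www.meizitu.com/a/1.html",
		"https://www.meizitu.com/a/2.html",
		"https://www.meizitu.com/a/3.html",
	}
	visited := process(urls, 2, func(u string) string { return "visited " + u })
	for _, v := range visited {
		fmt.Println(v)
	}
}
```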

2. Set up the pages to download. I use command-line flags to choose the start and end pages.

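// Reject stray positional arguments; only the -s and -e flags are expected.
if args := context.Args(); len(args) > 0 {
	return fmt.Errorf("invalid command: %q", args.Get(0))
}
start := context.Int("s") // first page index
log.Println("起始页:", start) // logs the start page
end := context.Int("e") // last page index (inclusive)
log.Println("截止页:", end) // logs the end page

// Queue one detail-page URL per index in [start, end].
for i := start; i <= end; i++ {
	url := fmt.Sprintf("https://www.meizitu.com/a/%d.html", i)
	q.AddURL(url)
}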
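
The loop above trusts whatever range it is given. As a small sketch, the URL generation can be pulled into a function that validates the range first. The name pageURLs and the rule that page indexes start at 1 are my own assumptions, not something taken from the original code.

```go
package main

import (
	"errors"
	"fmt"
)

// pageURLFormat is the detail-page pattern observed in the article.
const pageURLFormat = "https://www.meizitu.com/a/%d.html"

// pageURLs returns one detail-page URL per index in [start, end].
// Assumption: valid page indexes start at 1.
func pageURLs(start, end int) ([]string, error) {
	if start < 1 {
		return nil, errors.New("start page must be at least 1")
	}
	if end < start {
		return nil, fmt.Errorf("end page %d is before start page %d", end, start)
	}
	urls := make([]string, 0, end-start+1)
	for i := start; i <= end; i++ {
		urls = append(urls, fmt.Sprintf(pageURLFormat, i))
	}
	return urls, nil
}

func main() {
	urls, err := pageURLs(100, 102)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

Each returned URL could then be passed to q.AddURL exactly as in the loop above.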

3. Create a colly collector to process the URLs in the queue. In OnHTML, I look up the img nodes under .postContent, read each image's path, and put it back into the queue.

c := colly.NewCollector()

// Present a desktop browser User-Agent.
c.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"

// For every .postContent block, queue the src of each img inside it.
c.OnHTML(".postContent", func(e *colly.HTMLElement) {
	e.ForEach("img", func(i int, element *colly.HTMLElement) {
		q.AddURL(element.Attr("src"))
	})
})

// Only JPEG responses are saved; download is a helper not shown in this post.
c.OnResponse(func(resp *colly.Response) {
	if strings.Contains(resp.Headers.Get("Content-Type"), "image/jpeg") {
		download(resp.Body)
	}
})

// Log every URL before it is requested.
c.OnRequest(func(r *colly.Request) {
	fmt.Println("Visiting", r.URL)
})

// Consume the queue with this collector until it is empty.
q.Run(c)
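
The download helper is not shown above, so here is a rough sketch of what saving a response body might look like. This is not the author's implementation: the names imageExt, randomName and saveImage are invented for illustration, and a random hex name from crypto/rand stands in for the UUID package so the example needs only the standard library. It also uses mime.ParseMediaType to look at the Content-Type, which handles values with parameters such as "image/jpeg; charset=binary".

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"mime"
	"os"
	"path/filepath"
)

// imageExt maps a Content-Type header value to a file extension.
// It returns "" for anything that is not one of the listed image types.
func imageExt(contentType string) string {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return ""
	}
	switch mediaType {
	case "image/jpeg":
		return ".jpg"
	case "image/png":
		return ".png"
	case "image/gif":
		return ".gif"
	}
	return ""
}

// randomName returns 32 hex characters read from crypto/rand.
// It is a stand-in for a UUID, not a real UUID.
func randomName() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// saveImage writes body into dir under a random file name and returns the path.
func saveImage(dir, ext string, body []byte) (string, error) {
	if err := os.MkdirAll(dir, 0755); err != nil {
		return "", err
	}
	name, err := randomName()
	if err != nil {
		return "", err
	}
	path := filepath.Join(dir, name+ext)
	return path, os.WriteFile(path, body, 0644)
}

func main() {
	// Write into a throwaway directory so the demo leaves nothing behind.
	dir, err := os.MkdirTemp("", "meizitu-demo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	ext := imageExt("image/jpeg; charset=binary")
	path, err := saveImage(dir, ext, []byte("placeholder bytes, not a real image"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("saved", path)
}
```

In the OnResponse callback, something like imageExt(resp.Headers.Get("Content-Type")) could replace the strings.Contains check, skipping the response when it returns an empty string.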

The end result:



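The code is simple. If you need it, have a look at https://github.com/eyiadmin/meizitu . If you only want to download images, grab the exe file and run main -s 1 -e 100 on the command line. In the log output, 起始页 means "start page" and 截止页 means "end page".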

main -s 100 -e 108
2019/10/07 14:26:42 起始页: 100
2019/10/07 14:26:43 截止页: 108
Visiting https://www.meizitu.com/a/100.html
Visiting https://www.meizitu.com/a/101.html
Visiting https://www.meizitu.com/a/102.html
Visiting https://www.meizitu.com/a/103.html
Visiting https://www.meizitu.com/a/104.html
Visiting https://www.meizitu.com/a/105.html
Visiting https://www.meizitu.com/a/106.html
Visiting https://www.meizitu.com/a/107.html
Visiting https://www.meizitu.com/a/108.html
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/01.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/02.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/03.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/04.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/05.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/06.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/30/01.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/07.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/08.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/29/09.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/30/02.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/30/03.jpg
Visiting http://pic.topmeizi.com/wp-content/uploads/2012a/01/30/04.jpg
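
For anyone curious how the -s and -e flags in the command above might be read, here is a sketch using the standard library's flag package instead of the cli package the tool actually uses. The function name parseRange and the default value of 1 for both flags are assumptions for illustration. It mirrors the check from step 2 that rejects stray positional arguments.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseRange reads -s (start page) and -e (end page) from args.
// The defaults of 1 are an assumption, not taken from the original tool.
func parseRange(args []string) (start, end int, err error) {
	fs := flag.NewFlagSet("main", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the demo output
	fs.IntVar(&start, "s", 1, "start page")
	fs.IntVar(&end, "e", 1, "end page")
	if err = fs.Parse(args); err != nil {
		return 0, 0, err
	}
	// Anything left over after the flags is treated as an invalid command.
	if fs.NArg() > 0 {
		return 0, 0, fmt.Errorf("invalid command: %q", fs.Arg(0))
	}
	return start, end, nil
}

func main() {
	start, end, err := parseRange([]string{"-s", "100", "-e", "108"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("start=%d end=%d\n", start, end)
}
```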

Author: eyiadmin

Published: 2019-10-24

Updated: 2024-05-31
