@songying 2018-08-27T17:50:48.000000Z Word count: 4805 Views: 2065

Maintenance Project Analysis

Python crawler


  # Add the following to the .zshrc file:
  export QB=root,10.127.25.14,22
  alias sshjump="ssh songyingxin_sx@jumpbox.qiyi.domain -o SendEnv=QB"

Note

You commented out part of the code in UGCRobot/UGCRobot/__init__.py.

1. 56

2. tudou


  • Failed URL: http://video.tudou.com/v/XMjAzNDUxMDYwOA==.html
  • Cause of failure: this is an old URL that must redirect to the new URL, and the corresponding request parameters have to be set
  • Fix: rewrite tudou.py; the request parameter problem has to be solved
  • JSON file address: ups.youku.com
  • Headers:
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36
    Referer: http://video.tudou.com/v/XMjAzNDUxMDYwOA==.html

http://ups.youku.com/ups/get.json?vid=508627652&ccode=050F&client_ip=192.168.1.1&utid=KAzfE6LzUAcCAcpsDvDsMj4%2F&client_ts=1534729426&

ckey=110%23KLkkAUkfkrai3FGg2wgAMuy2kMUOcYlOmkm2hQ7%2F8DBLkDUt21J%2FjnsoufZ881GQaUI2hn0yJHbHgBZxktOWjqjk0zFL8MzQfwbrGTUt5OJf5JdByUpgRGIvJZ2%2BxWRIuJ3%2BsAkwsBQhGq88j9cwkP5ysLgwjT444yjOmHvmb2ckBi%2BtjAZiYbJhs9bws9T4kcjQsBsisR449LdAltC3PEomZTzXs9g7krSj9BjM2pUkDoEo9%2FWxN9%2FDWUjfG5Lx90KM7agUGg7vKFDdiywugn7snRQjwXT%2BbR04JEZNj09g%2FTvu2wFtNq1BWM7b7AiR%2B8ToLggsHi%2BrG6kchUFjqPrEMa9LSET%2BcM9Dgrc%2BnjD4nD8I2B%2BbEKLbxyREtmxHV1R56TXr5ylby9UqZ599hqH31w9YbiF%2B2%2FnXDOdW7LoB34gZiH4uL9OHbbJAZTF0XLzeBM2c&site=-1&wintype=interior&p=1&fu=0&vs=1.0&rst=mp4&dq=flv&os=win&osv=&d=0&bt=pc&aw=w
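The request above can be assembled from its parts rather than pasted as one string. The following is a minimal sketch, assuming Python 3 and only the standard library. The function names `build_headers` and `build_ups_url` are hypothetical and are not taken from tudou.py. How `utid` and `ckey` are obtained is not covered by these notes, so the caller is assumed to supply them. The remaining parameters of the captured URL (`site`, `wintype`, `p` and so on) are left out for brevity.

```python
from urllib.parse import urlencode

# Endpoint taken from the captured request in these notes.
UPS_ENDPOINT = "http://ups.youku.com/ups/get.json"

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36"
)


def build_headers(page_url):
    """Return the two headers listed in the notes for a given video page."""
    return {"User-Agent": USER_AGENT, "Referer": page_url}


def build_ups_url(vid, ccode, client_ip, utid, client_ts, ckey):
    """Build the get.json URL; utid and ckey are assumed to be supplied by the caller."""
    params = [
        ("vid", vid),
        ("ccode", ccode),
        ("client_ip", client_ip),
        ("utid", utid),
        ("client_ts", client_ts),
        ("ckey", ckey),
    ]
    # urlencode percent-encodes reserved characters such as "/" and "#".
    return UPS_ENDPOINT + "?" + urlencode(params)
```

Passing the raw (unescaped) values keeps the escaping in one place, so characters such as `/` in `utid` and `#` in `ckey` are encoded consistently.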

3. mgtv

https://pstream.api.mgtv.com/player/getSource?video_id=4507688&
did=c477fa3e-84d5-4086-9bec-a29a61bce900&suuid=e2c487af-2387-4f68-8a14-f49be542407a
&collection_id=320516&pm2=nV~_UltXq93XzgAa30FIUGqnBpMtiDYTfpR0Jutxr3nlPAQ8q5f35WPo9ha4paNU3kmjLiaC_B29L0ahNYXhkqEi61_ERZ2gTBxC97wcXcooDFpNBN1ecFVn59aK7_dBHSQRIXRo3IiRV5tE9z8_V58DR5pyFAiEIDtaShWKJHsHNhSmwJxgJrrYSSHkIUxdA8AkXosKccH~JmCr8QeJQSbokzdhBhRdk~opAZHU8pST2Z1pKix17FPSALMDQLm4tgS2qUzyVvHoZ1VIX0T45kOHxVjre8_q9YOKAuRSe2aDyhzcYbOLhrVBpPntmpyCtoxYvqTYees-&tk2=wATOlNmYxYTY5ITYtMWZilTL2gDM00SNkRDOtU2MhZ2N3QzY9QWakxHMwATM98mbwxXMwADMuMjLw0jclZHfzQDNxADN0MTNx0Ddpx2Y&_support=10000000

4. pptv

5. sohu

6. Toutiao

  2018-08-08 16:36:58 - VideoMetaUtil.getVideoIntroduction.39 - ERROR : Traceback (most recent call last):
    File "/opt/vtc/UGCRobot/resolution/VideoMetaUtil.py", line 33, in getVideoIntroduction
      video_dict = parser.parse()
    File "/opt/vtc/UGCRobot/meta/meta_resolution.py", line 219, in parse
      title = match.group(1)
  AttributeError: 'NoneType' object has no attribute 'group'
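The `AttributeError` above means the regular expression did not match the page, so `match` was `None` when `.group(1)` was called. A minimal sketch of guarding against that, assuming Python 3. The pattern `TITLE_RE` and the function `extract_title` are hypothetical; the actual pattern used in meta_resolution.py is not shown in these notes.

```python
import re

# Hypothetical pattern for illustration; the real one in meta_resolution.py is unknown.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.S)


def extract_title(html):
    """Return the page title, or None when the pattern does not match."""
    match = TITLE_RE.search(html)
    if match is None:
        # The page layout probably changed; let the caller decide how to report it.
        return None
    return match.group(1).strip()
```

Returning `None` turns the crash into a case the caller can log together with the failing URL, which makes a changed page layout easier to spot.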

7. Baidu

  2018-08-08 16:36:17 - VideoMetaUtil.getVideoIntroduction.39 - ERROR : Traceback (most recent call last):
    File "/opt/vtc/UGCRobot/resolution/VideoMetaUtil.py", line 32, in getVideoIntroduction
      parser = parser_cls(url, logger)
    File "/opt/vtc/UGCRobot/meta/meta_resolution.py", line 23, in __init__
      raise Exception("meta parser exception!")
  Exception: meta parser exception!
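The message "meta parser exception!" in the traceback above says neither which URL failed nor why. One possible improvement, sketched here under the assumption of Python 3, is a dedicated exception type that carries both. `MetaParserError` and `get_video_introduction` are hypothetical names and are not the project's actual code; only the `parser_cls(url, logger)` and `parser.parse()` calls are taken from the traceback.

```python
class MetaParserError(Exception):
    """Hypothetical replacement for the bare Exception in the traceback."""

    def __init__(self, url, reason):
        super().__init__("meta parser exception for %s: %s" % (url, reason))
        self.url = url
        self.reason = reason


def get_video_introduction(parser_cls, url, logger):
    """Build the parser and parse; log the URL and reason on failure."""
    try:
        parser = parser_cls(url, logger)
        return parser.parse()
    except MetaParserError as exc:
        logger.error("cannot parse %s: %s", exc.url, exc.reason)
        return None
```

With this shape the log line names the failing URL directly, so it is not necessary to look it up from the surrounding log entries.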

Problem 2: upload failed because today's upload limit was exceeded

8. meipai

  1. Some links point to articles rather than videos, e.g.: http://www.meipai.com/media/1031259795
  2. The link is dead, e.g.: http://service.vtc.qiyi.domain:6100/swiftlog?dc=bj&logtype=UGC-robot&date=20180801&logname=5b610e8bb302c333fcb077e2

9. qq: Tencent

  1. Downloading the video works, but the upload fails, e.g.: http://service.vtc.qiyi.domain:6100/swiftlog?dc=jy&logtype=UGC-robot&date=20180803&logname=5b63c853b302c333fd6b2b7b
  2. The video link is dead, e.g.: https://v.qq.com/x/page/m0743mbmgc1.html
  3. VIP videos cannot be downloaded