Dev Stats Scraper

Discussion in 'Hobbies' started by Funky Biskit, Jan 8, 2017.

  1. Funky Biskit

    Funky Biskit Member

    So I really like numbers. The !stats command has always interested me because it's cool to see all of your collective playtime and influence represented by numbers. However, there's a limitation: as you may know, we mere members can't look at other players' stats. I've solved that. I wrote a Python web scraper that gathers data from the stats page for any player whose Steam ID you have (easily retrievable with the "status" command in-game). Keep in mind that there are dependencies: use pip to install lxml and requests, and then you should be good to go. There is plenty more data to be scraped, so go ahead and improve the script if you feel so inclined. Let me know if you find any bugs!

    Screenshot of the tool in action:
    [IMG]

    To the lovely staff, I don't believe this is necessarily a harmful tool and I hope that you aren't upset by me releasing it. If anything, you can read the code and patch my retrieval method if you don't want this kind of functionality.
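
    On the Steam ID input mentioned above: the script appends whatever is typed straight onto the URL. A minimal sketch of validating and URL-encoding it first might look like the following. It assumes IDs arrive in the legacy STEAM_X:Y:Z text form, and that the stats page accepts a percent-encoded query string; neither is confirmed by this thread. `build_stats_url` and `STEAM_ID_PATTERN` are hypothetical names used only for illustration.

```python
import re
from urllib.parse import urlencode

# Base URL taken from the script posted in this thread
STATS_URL = "http://www.seriousgmod.com/stats/stats.php"

# Assumption: IDs are in the legacy STEAM_X:Y:Z text form
STEAM_ID_PATTERN = re.compile(r"STEAM_[0-5]:[01]:\d+")


def build_stats_url(steamid):
    """Return the stats URL for a Steam ID, or None if the ID looks malformed."""
    steamid = steamid.strip()
    if not STEAM_ID_PATTERN.fullmatch(steamid):
        return None
    # urlencode percent-escapes the colons so the query string is well-formed
    return STATS_URL + "?" + urlencode({"steamid": steamid})


if __name__ == "__main__":
    # Made-up ID for illustration only
    print(build_stats_url("STEAM_0:1:12345"))
```

    Returning None for a malformed ID lets the caller print a message instead of sending a request that is bound to come back empty.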


    Code:
    #SeriousData Webscraper by Funky Biskit
    #Last Updated January 8th, 2017
    
    import os
    import re
    from lxml import html
    import requests
    
    # Clear the console ('cls' on Windows, 'clear' elsewhere)
    os.system('cls' if os.name == 'nt' else 'clear')
    
    steamid = input("Enter a Steam ID here: ")
    
    if(steamid != ''):
    
        url = 'http://www.seriousgmod.com/stats/stats.php?steamid=' + steamid
    
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows; Valve Source Client) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1003.1 Safari/535.19 Awesomium/1.7.5.1 GMod/13',
        }
        response = requests.get(url, headers=headers)
        tree = html.fromstring(response.content)
      
        def playTimes(stat):
            servers = tree.xpath('//div[@class="serverbox"]/h3[1]/text()')
            playtimes = tree.xpath('//div[@class="serverbox"]/h5[1]/text()[preceding-sibling::br]')
          
            if(stat == "times"):
                for x in range(0,len(playtimes)):
                    print(servers[x] + ": " + str(playtimes[x]))
                  
            elif(stat == "fav"):
                toptime = 0.0
                favindex = 0
                for x in range(0,len(playtimes)):
                    playtimes[x] = re.sub("[^0-9 ]", "", playtimes[x])
                    sep = playtimes[x].split()
                    if(len(sep)==2):
                        playtimes[x] = float(sep[0]) + round(int(sep[1])/60,2)
                    else:
                        playtimes[x] = round(int(sep[0])/60,2)
                  
                    if(toptime<playtimes[x]):
                        toptime = playtimes[x]
                        favindex = x
                return servers[favindex]
    
        screenname = tree.xpath('//div[@class="w-section main-section"]/h1[1]/text()')
      
        if(screenname):
            totalhours = tree.xpath('//div[@class="w-row totals"]/div[1]/h4/strong/text()')
            kills = tree.xpath('//div[@class="w-row totals"]/div[2]/h4/strong/text()')
            deaths = tree.xpath('//div[@class="w-row totals"]/div[3]/h4/strong/text()')
            headshots = tree.xpath('//div[@class="w-row totals"]/div[4]/h4/strong/text()')
          
            kills[0] = kills[0].replace(',','')
            deaths[0] = deaths[0].replace(',','')
            headshots[0] = headshots[0].replace(',','')
          
            print("Name: "+screenname[0])
            print("Total Hours: "+totalhours[0])
            print("Total Kills: "+kills[0])
            print("Total Deaths: "+deaths[0])
            print("Total Headshots: "+headshots[0])
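            # Guard against division by zero for players with no deaths or no kills
            print("Career K/D Ratio: "+(str(round(int(kills[0])/int(deaths[0]),2)) if int(deaths[0]) else "N/A"))
            print("Career Headshot Ratio: "+(str(round(int(headshots[0])/int(kills[0]),2)) if int(kills[0]) else "N/A"))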
            print("Favorite Server: "+playTimes("fav"))
            print("\n--- PLAYTIMES ---\n")
            playTimes("times")
        else:
            print("No records found.")
    else:
        print("No Steam ID given.")
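
    For anyone who wants to reuse the favorite-server logic outside the script, here is a minimal sketch of the playtime parsing on its own. It assumes, based only on how the script strips non-digits and splits on spaces, that playtime strings look like "12 hours 30 minutes" or "45 minutes"; the real page format may differ. `parse_playtime_hours` and `favorite_server` are hypothetical helper names, not part of the posted script.

```python
import re


def parse_playtime_hours(text):
    """Convert a playtime string to hours as a float.

    Assumed formats (not confirmed against the real page):
    "<H> hours <M> minutes" or "<M> minutes".
    """
    # Keep only digits and spaces, then split into numeric parts
    parts = re.sub(r"[^0-9 ]", "", text).split()
    if len(parts) == 2:
        return int(parts[0]) + round(int(parts[1]) / 60, 2)
    if len(parts) == 1:
        return round(int(parts[0]) / 60, 2)
    # Nothing numeric in the text: treat it as no playtime
    return 0.0


def favorite_server(servers, playtimes):
    """Return the server with the most hours, or None if there is no data."""
    pairs = list(zip(servers, playtimes))
    if not pairs:
        return None
    # max() keeps the first entry on ties, like the loop in the script
    return max(pairs, key=lambda pair: parse_playtime_hours(pair[1]))[0]


if __name__ == "__main__":
    # Made-up sample data for illustration only
    names = ["Server A", "Server B", "Server C"]
    times = ["12 hours 30 minutes", "45 minutes", "3 hours 5 minutes"]
    print(favorite_server(names, times))
```

    Keeping the parsing separate from the printing means the same numbers can feed both the favorite-server lookup and the playtime list without reparsing the page.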
    
     
  2. Togo ✿

    Togo ✿ Nobody Gets it VIP Silver

    cool
     
  3. PixeL

    PixeL One notices others' faults and forgets one's own Banned VIP Silver

    Nice job on this. I like.
     
  4. Paradox

    Paradox The One Eyed Ghoul Banned Elite

    Not much use, at least for me, since you can ask people for their stats and they can share them if they feel like it, but nice!
     
  5. Chai

    Chai returned; VIP

    A step closer to a dynamic signature.
     
  6. Skyrossm

    Skyrossm Ideal Female Moderator? VIP Emerald Bronze

    I think @My Dime Is Up did this but won't share :c
     
  7. Funky Biskit

    Funky Biskit Member

    I'm working on this as we speak and will make code available publicly :)
     
  8. Chai

    Chai returned; VIP

    Well, nope, he didn't.
    What he shared previously isn't a proper dynamic signature; it just takes screenshots of the website and involves zero scraping techniques.
     
  9. Funky Biskit

    Funky Biskit Member

    Done! Check my signature. I think I'll release it on GitHub, but I'll also offer a service where community members can pay me in server points (gotta buy my T rounds somehow!) to have it hosted on my server with customization options :)
     
  10. Chai

    Chai returned; VIP

    Very nice!

    I might disrupt your business :)
     
  11. Funky Biskit

    Funky Biskit Member

  12. Funky Biskit

    Funky Biskit Member

    I found that non-Latin characters cause errors on Windows, and I didn't take into account that some people haven't played on certain servers, which also causes errors.

    Updated code here:
    Code:
    #SeriousData Webscraper by Funky Biskit
    #Last Updated January 9th, 2017
    
    import os
    import re
    import unicodedata
    from lxml import html
    import requests
    
    # Clear the console ('cls' on Windows, 'clear' elsewhere)
    os.system('cls' if os.name == 'nt' else 'clear')
    
    steamid = input("Enter a Steam ID here: ")
    
    if(steamid != ''):
    
        url = 'http://www.seriousgmod.com/stats/stats.php?steamid=' + steamid
    
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows; Valve Source Client) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1003.1 Safari/535.19 Awesomium/1.7.5.1 GMod/13',
        }
        response = requests.get(url, headers=headers)
        tree = html.fromstring(response.content)
       
        def playTimes(stat):
            servers = tree.xpath('//div[@class="serverbox"]/h3[1]/text()')
            playtimes = tree.xpath('//div[@class="serverbox"]/h5[1]/text()[preceding-sibling::br]')
           
            if(stat == "times"):
                for x in range(0,len(playtimes)):
                    print(servers[x] + ": " + str(playtimes[x]))
                   
            elif(stat == "fav"):
                toptime = 0.0
                favindex = 0
                for x in range(0,len(playtimes)):
                    if(playtimes[x] != "N/A"):
                        playtimes[x] = re.sub("[^0-9 ]", "", playtimes[x])
                        sep = playtimes[x].split()
                        if(len(sep)==2):
                            playtimes[x] = float(sep[0]) + round(int(sep[1])/60,2)
                        else:
                            playtimes[x] = round(int(sep[0])/60,2)
                       
                        if(toptime<playtimes[x]):
                            toptime = playtimes[x]
                            favindex = x
                return servers[favindex]
    
               
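        def gracefully_degrade_to_ascii(text):
            # Decompose accented characters, drop anything without an ASCII equivalent, return a str
            return unicodedata.normalize('NFKD',text).encode('ascii','ignore').decode('ascii')
       
        screenname = tree.xpath('//div[@class="w-section main-section"]/h1[1]/text()')
       
        if(screenname):
            # Convert only after confirming a name was found; indexing an empty list raises IndexError
            screenname[0] = gracefully_degrade_to_ascii(screenname[0])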
            totalhours = tree.xpath('//div[@class="w-row totals"]/div[1]/h4/strong/text()')
            kills = tree.xpath('//div[@class="w-row totals"]/div[2]/h4/strong/text()')
            deaths = tree.xpath('//div[@class="w-row totals"]/div[3]/h4/strong/text()')
            headshots = tree.xpath('//div[@class="w-row totals"]/div[4]/h4/strong/text()')
           
            kills[0] = kills[0].replace(',','')
            deaths[0] = deaths[0].replace(',','')
            headshots[0] = headshots[0].replace(',','')
           
            print("Name: "+screenname[0])
            print("Total Hours: "+totalhours[0])
            print("Total Kills: "+kills[0])
            print("Total Deaths: "+deaths[0])
            print("Total Headshots: "+headshots[0])
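            # Guard against division by zero for players with no deaths or no kills
            print("Career K/D Ratio: "+(str(round(int(kills[0])/int(deaths[0]),2)) if int(deaths[0]) else "N/A"))
            print("Career Headshot Ratio: "+(str(round(int(headshots[0])/int(kills[0]),2)) if int(kills[0]) else "N/A"))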
            print("Favorite Server: "+playTimes("fav"))
            print("\n--- PLAYTIMES ---\n")
            playTimes("times")
        else:
            print("No records found.")
    else:
        print("No Steam ID given.")
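
    The non-Latin name handling can be tried in isolation. The sketch below uses the same NFKD-then-ASCII idea, decoding the bytes back to a string, alongside a small ratio helper that returns None when the denominator is zero. `to_ascii` and `safe_ratio` are hypothetical names used only for illustration, and the sample values are made up.

```python
import unicodedata


def to_ascii(text):
    """Decompose accented characters (NFKD) and drop anything non-ASCII."""
    normalized = unicodedata.normalize("NFKD", text)
    return normalized.encode("ascii", "ignore").decode("ascii")


def safe_ratio(numerator, denominator):
    """Return the ratio rounded to 2 places, or None when the denominator is 0."""
    if denominator == 0:
        return None
    return round(numerator / denominator, 2)


if __name__ == "__main__":
    # "\u00e9" is an accented e and "\u273f" is a flower symbol
    print(to_ascii("Pix\u00e9l \u273f"))
    print(safe_ratio(150, 100))
    print(safe_ratio(10, 0))
```

    Note that characters with no ASCII decomposition are dropped entirely, so a name made up only of such characters comes back as an empty string.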