小能豆

Log in to an ASP.NET site with Python Requests

python

I’m trying to scrape a page for data, but its login process has me stumped. As it’s an ASP.NET site, my searches have led me to include __VIEWSTATE and __VIEWSTATEGENERATOR, but I cannot find __EVENTTARGET or __EVENTVALIDATION on the page, and I’m not sure whether they can sometimes be missing.

The website’s login page has this form (personal data gets prefilled, so I’ve replaced it with *):

<form method="get" action="./login.aspx" id="validateSubmitForm" autocomplete="off" novalidate="">
<div class="aspNetHidden">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="*****long viewstate here****" />
</div>

<div class="aspNetHidden">

    <input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="******" />
</div>

                      <div class="row">
                        <div class="form-group col-md-12 mb-4">
                            <!--
                          <input type="email" class="form-control input-lg" id="email" aria-describedby="emailHelp"
                            placeholder="email"> -->
                            <input name="TextBox1N" type="text" value="*******" id="TextBox1N" title="Username" class="form-control input-lg" placeholder="Username" />
                        </div>
                        <div class="form-group col-md-12 ">
                            <!--
                          <input type="password" class="form-control input-lg" id="password" placeholder="Password">
                            -->
                            <input name="TextBox2N" type="password" id="TextBox2N" class="form-control input-lg" placeholder="Password" value="******" />
                        </div>
                        <div class="form-group col-md-12 ">

                        </div>
                        <div class="col-md-12">

                          <div class="d-flex justify-content-between mb-3">

                            <div class="custom-control custom-checkbox mr-3 mb-3">
                                <!--
                              <input type="checkbox" class="custom-control-input" id="customCheck2">
                              <label class="custom-control-label" for="customCheck2">Remember me</label>
                                -->
                                <input id="CheckBox1N" type="checkbox" name="CheckBox1N" checked="checked" />
                                <span id="remember_meN" for="CheckBox1N">Remember me</span>
                            </div>

                            <a class="text-color" href="remember.aspx"> Remember </a>

                          </div>
                            <!--
                          <button type="submit" class="btn btn-primary btn-pill mb-4" style="width:100% !important">Sign In</button>
                            -->
                            <input type="submit" name="Button1N" value="Sign in" id="Button1N" class="btn btn-primary btn-pill mb-4" style="width:100% !important" />
                          <p>
                              Don't have an account yet ?

                            <a class="text-blue" href="registrati.aspx"> Sign</a>
                              <!--
                              <input type="submit" name="Button2N" value="Sign up" id="Button2N" class="text-blue" />
                                      -->
                          </p>
                        </div>
                      </div>
                    </form>

What I’ve cobbled together so far (URL and login info masked):

from bs4 import BeautifulSoup
import requests

#Session Setup
s = requests.Session()
s.headers.update({"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"})
uName='******'
pwd ='******' 

#Load page
url='http://***/login.aspx'
r = s.get(url)
soup = BeautifulSoup(r.text, 'html.parser')

#Set params
paramsPost = {"TextBox1N": uName,
              "TextBox2N": pwd,
              "CheckBox1N": "on",
              "Button1N": "Sign in"
             }

#Add __VIEWSTATE params
paramsPost['__VIEWSTATE'] = soup.find('input', id='__VIEWSTATE')['value']
paramsPost['__VIEWSTATEGENERATOR'] = soup.find('input', id='__VIEWSTATEGENERATOR')['value']

#Login to a GET form
req = requests.Request('GET', url, data=paramsPost)
prep = req.prepare()
pUrl = prep.url+'?'+prep.body #this was mostly done so I could print the full url and verify against a browser generated one
r = s.get(url)

For completeness, I have also tried a POST:

r = s.post(url, data=paramsPost)
print(r.url)

Both ways just send me to the ./error.aspx page.

Logging in with a browser and inspecting the network traffic shows that a GET request is made, with __VIEWSTATE, __VIEWSTATEGENERATOR, TextBox1N, TextBox2N, CheckBox1N and Button1N appended to the request URL. It returns status 302 and then redirects to ./dashboardAssets.aspx.

Interestingly, the __VIEWSTATE my code retrieves is shorter than the one my browser sends. Is this related?

Everything I find on Google or SO points to the __EVENT parameters, but I can’t locate them on the page, so I’m not sure this site needs them.

Any other ideas I can try?


Views: 79
2023-12-22

1 answer

小能豆

Based on the HTML you posted, the form is submitted with a GET request, so the fields are sent in the URL’s query string. Your code builds that URL with requests.Request, but the final call is s.get(url), which requests the bare login page and never sends pUrl. Call s.get(pUrl) instead, or pass the fields with params= so that Requests builds the query string for you.
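To see the difference between the two keyword arguments without touching the network, here is a small sketch that only prepares the requests and prints them. The URL and field values are placeholders, not the real site’s.

```python
import requests

url = "http://example.com/login.aspx"  # placeholder URL, not the real site
fields = {"TextBox1N": "user", "Button1N": "Sign in"}  # placeholder values

# data= on a GET request puts the encoded fields in the request body;
# the URL itself is left without a query string.
as_data = requests.Request("GET", url, data=fields).prepare()
print(as_data.url)
print(as_data.body)

# params= appends the same fields to the URL as a query string,
# which is what a browser does for <form method="get">.
as_params = requests.Request("GET", url, params=fields).prepare()
print(as_params.url)
print(as_params.body)
```

Nothing is sent here; prepare() only builds the request, so this is safe to run while comparing against the URL your browser generates.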

Here’s a modified version of your code:

from bs4 import BeautifulSoup
import requests

# Session Setup
s = requests.Session()
s.headers.update({"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"})
uName = '******'
pwd = '******'

# Load page
url = 'http://***/login.aspx'
r = s.get(url)
soup = BeautifulSoup(r.text, 'html.parser')

# Set params
paramsPost = {"TextBox1N": uName,
              "TextBox2N": pwd,
              "CheckBox1N": "on",
              "Button1N": "Sign in"
              }

# Add __VIEWSTATE params
paramsPost['__VIEWSTATE'] = soup.find('input', id='__VIEWSTATE')['value']
paramsPost['__VIEWSTATEGENERATOR'] = soup.find('input', id='__VIEWSTATEGENERATOR')['value']

# Prepare the GET request URL
req = requests.Request('GET', url, params=paramsPost)
prep = req.prepare()
pUrl = prep.url

# Make the GET request
r = s.get(pUrl)

# Print the final URL and check the response
print(pUrl)
print(r.url)
print(r.text)

This mimics the browser by appending the parameters to the URL as a query string. If you still land on ./error.aspx, inspect r.url, r.history and r.text for error messages or other clues about why the login fails. The login may also depend on request headers or JavaScript-driven steps that this code does not reproduce.
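If you are unsure which hidden fields the page expects, one option is to copy every input of the form instead of naming them one by one, so that fields such as __EVENTVALIDATION are picked up whenever the page happens to emit them. The sketch below is only an illustration: FormFieldCollector and collect_form_fields are hypothetical helpers, the sample HTML is a trimmed stand-in for your form, and it uses the standard library’s HTMLParser rather than BeautifulSoup so that it runs without extra dependencies.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of <input> elements inside one form."""

    def __init__(self, form_id, submit_name):
        super().__init__()
        self.form_id = form_id
        self.submit_name = submit_name
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
            return
        if tag != "input" or not self.in_form or not attrs.get("name"):
            return
        kind = (attrs.get("type") or "text").lower()
        if kind == "checkbox":
            # Browsers only send checked boxes; "on" is the default value.
            if "checked" not in attrs:
                return
            value = attrs.get("value") or "on"
        elif kind in ("submit", "button", "image", "reset"):
            # Only the button that was "clicked" is submitted.
            if kind != "submit" or attrs["name"] != self.submit_name:
                return
            value = attrs.get("value") or ""
        else:
            value = attrs.get("value") or ""
        self.fields[attrs["name"]] = value

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def collect_form_fields(html, form_id, submit_name):
    parser = FormFieldCollector(form_id, submit_name)
    parser.feed(html)
    parser.close()
    return parser.fields


# Trimmed stand-in for the login form; commented-out inputs are skipped
# because the parser never reports them as tags.
sample_html = """
<form method="get" action="./login.aspx" id="validateSubmitForm">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc" />
  <input type="hidden" name="__VIEWSTATEGENERATOR" value="123" />
  <input name="TextBox1N" type="text" value="" />
  <input name="TextBox2N" type="password" value="" />
  <input id="CheckBox1N" type="checkbox" name="CheckBox1N" checked="checked" />
  <input type="submit" name="Button1N" value="Sign in" />
  <!-- <input type="submit" name="Button2N" value="Sign up" /> -->
</form>
"""

fields = collect_form_fields(sample_html, "validateSubmitForm", "Button1N")
print(fields)
```

With the real page you would feed it r.text from the first s.get(url), overwrite TextBox1N and TextBox2N with your credentials, and pass the resulting dict as params= to the login request.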

2023-12-22