
I am running a crawler to collect data. Everything works fine without pagination. The code below is the part that fails, and I need your help.

When I run the code, this error message appears after roughly 30 seconds:

ReadTimeoutError: HTTPConnectionPool(host='localhost', port=36529): Read timed out. (read timeout=120)

Below is my pagination code. Please help me fix what is wrong with it.

        if c == len(cnt_accident) - 1:
            # The pagination bar has '<<', '<', the numeric buttons 1-10, '>', and '>>': 14 buttons in total
            # r ranges from 3 to 12, so the numeric buttons up to li[12] are clicked directly
            if r < 13:
                # time.sleep(1)
                r += 1
                # btn_page = driver.find_element(By.CSS_SELECTOR, f'#main > div.contentsRow > div.col-md-12.col-xs-12.col-sm-12.t-center.pad0 > div > ul > li:nth-child({r}) > a')
                btn_page = driver.find_element(By.XPATH, f'//*[@id="main"]/div[2]/div[2]/div/ul/li[4]/a')
                time.sleep(30)
                btn_page.click()
                # driver.execute_script("arguments[0].click();", btn_page)
                print('yes, i have button!')
                time.sleep(3)
            # When r is 13, click the '>' button
            elif r == 13:
                try:
                    btn_page = driver.find_element(By.XPATH, '//*[@id="main"]/div[2]/div[2]/div/ul/li[13]/a')
                    driver.execute_script('arguments[0].click();', btn_page)
                    print('yes, i have button!')
                    time.sleep(3)
                    r = 3
                except NoSuchElementException:
                    print(f"Page button {r} could not be found.")
                    break
    # Exception handling
    except Exception as e:
        print('No content value.', e)
        print(c + 1)
        driver.back()
        if c == len(cnt_accident) - 1:
            # The pagination bar has '<<', '<', the numeric buttons 1-10, '>', and '>>': 14 buttons in total
            # r ranges from 3 to 12, so the numeric buttons up to li[12] are clicked directly
            if r < 13:
                btn_page = driver.find_element(By.XPATH, f'//*[@id="main"]/div[2]/div[2]/div/ul/li[{r}]/a')
                driver.execute_script("arguments[0].click();", btn_page)
                print('yes, i have button!')
                r += 1
                time.sleep(2)
            # When r is 13, click the '>' button
            elif r == 13:
                btn_page = driver.find_element(By.XPATH, '//*[@id="main"]/div[2]/div[2]/div/ul/li[13]/a')
                driver.execute_script('arguments[0].click();', btn_page)
                time.sleep(2)
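
For reference, the button-index cycle that the comments above describe can be isolated from the Selenium calls and checked on its own. This is only a minimal sketch, not a fix for the timeout. The helper names `next_button` and `button_xpath` are hypothetical, and the sketch assumes that after '>' is clicked the newly shown page sits at li[3] again. The XPath is the one used in the code above.

```python
FIRST_NUMBER_LI = 3   # li[3] is the first numeric button (after '<<' and '<')
LAST_NUMBER_LI = 12   # li[12] is the tenth numeric button
NEXT_BLOCK_LI = 13    # li[13] is the '>' button


def button_xpath(li_index):
    """Build the XPath of one pagination button from its li position."""
    return f'//*[@id="main"]/div[2]/div[2]/div/ul/li[{li_index}]/a'


def next_button(current_li):
    """Return (li_to_click, new_current_li) for moving one page forward.

    current_li is the li position of the page currently shown.
    Assumption: clicking '>' shows the next block with its first page at li[3].
    """
    if current_li < LAST_NUMBER_LI:
        # Still inside the block of ten: click the next numeric button.
        return current_li + 1, current_li + 1
    # On the last numeric button: click '>' and restart at the first slot.
    return NEXT_BLOCK_LI, FIRST_NUMBER_LI


if __name__ == "__main__":
    current = FIRST_NUMBER_LI
    for _ in range(11):
        li_to_click, current = next_button(current)
        print(button_xpath(li_to_click))
```

With a helper like this, both the normal branch and the `except` branch could call the same function instead of each keeping its own copy of the `r` bookkeeping.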
  • There is likely a much easier way to write this code but I can't help without the URL and some basic instructions on what you're trying to accomplish.
    – JeffC
    Commented Jan 3 at 3:37
  • Which website are you scraping? What, exactly, are you trying to achieve?
    – SIGHUP
    Commented Jan 3 at 9:24
  • That's a read timeout from the driver's server. It sounds like you are running out of resources or something crashed. Are you doing anything with threads/concurrency?
    Commented Jan 3 at 17:59
