eBay is a well-known online platform that offers trading opportunities across a wide range of products among its registered users. In this guide, we explain how to scrape data from eBay listings using Python. We'll look at the data available on the listing page itself, and then visit each product in turn to gather more details.
To get started, make sure you have the following Python libraries installed: requests, lxml, and pandas.
Install them using:
pip install requests lxml pandas
When searching for products on eBay, the URL of each results page can be modified to navigate through paginated results. For example, a search for laptops passes the keyword in the _nkw parameter and the page number in the _pgn parameter.
The _pgn parameter is used to move between the different pages of listings, which makes it possible to retrieve data at scale. Let's start the scraping process.
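To make the pagination idea concrete, here is a minimal sketch that builds one search URL per results page. The helper name build_listing_urls is hypothetical, and the sketch assumes that page numbering starts at 1 and that the search endpoint is the one used in the complete code later in this guide.

```python
from urllib.parse import urlencode

# Search endpoint used in the complete code later in this guide
BASE_URL = "https://www.ebay.com/sch/i.html"


def build_listing_urls(keyword, pages):
    """Return one search URL per results page (hypothetical helper).

    Assumes page numbers start at 1 and are passed in the _pgn parameter.
    """
    urls = []
    for page in range(1, pages + 1):
        query = urlencode({"_nkw": keyword, "_pgn": page})
        urls.append(f"{BASE_URL}?{query}")
    return urls


# Print the URLs for the first three result pages of a "laptop" search
for url in build_listing_urls("laptop", 3):
    print(url)
```

Each URL could then be requested in turn, in the same way as the single listing page in the steps that follow.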
To begin, we'll set up headers that mimic a real browser request, which helps avoid detection and potential blocking by eBay's anti-bot measures. Then we'll send a request to the listing page to collect the links for each product.
import requests
from lxml.html import fromstring
# Define headers to simulate a real browser
headers = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'accept-language': 'en-IN,en;q=0.9',
    'cache-control': 'no-cache',
    'dnt': '1',
    'pragma': 'no-cache',
    'priority': 'u=0, i',
    'sec-ch-ua': '"Google Chrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Linux"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'none',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36',
}
# Request parameters for the search query
params = {
    '_nkw': 'laptop',
}
# Send a request to the eBay listing page
listing_page_response = requests.get('https://www.ebay.com/sch/i.html', params=params, headers=headers)
listing_parser = fromstring(listing_page_response.text)
On the listing page, we'll extract the URLs of the individual products. This lets us visit each product page to collect specific details such as the product title, price, and more.
# Parse the listing page to extract product links
links = listing_parser.xpath('//div[@class="s-item__info clearfix"]/a[@_sp="p2351460.m1686.l7400"]/@href')
# Sample output of the links found
print("Product Links:", links[:5]) # Display the first five product links
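Links taken from a listing page may carry tracking query strings, and the same item can appear more than once. The sketch below shows one possible way to normalize and deduplicate them before visiting each page. The helper name clean_product_links and the sample URLs are hypothetical, and the sketch assumes the query string is not needed to load the product page.

```python
from urllib.parse import urlsplit, urlunsplit


def clean_product_links(links):
    """Drop query strings and fragments, then remove duplicates while keeping order."""
    seen = set()
    cleaned = []
    for link in links:
        parts = urlsplit(link)
        # Rebuild the URL from scheme, host, and path only
        bare = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        if bare not in seen:
            seen.add(bare)
            cleaned.append(bare)
    return cleaned


# Hypothetical sample links, shaped like hrefs a listing page might return
sample_links = [
    "https://www.ebay.com/itm/1234567890?hash=abc&_trkparms=xyz",
    "https://www.ebay.com/itm/1234567890?hash=def",
    "https://www.ebay.com/itm/9876543210",
]
print(clean_product_links(sample_links))
```

Fewer duplicate requests also means less load on the site, which fits the rate-limiting advice later in this guide.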
With the product URLs in hand, we'll visit each product page and extract the following details: title, price, shipping cost, condition, available quantity, sold quantity, payment options, and return policy.
Next, we'll loop through each link and use XPath expressions to locate the required information on the product page.
product_data = []
for url in links:
    # Send a request to the product page
    product_page_response = requests.get(url, headers=headers)
    product_parser = fromstring(product_page_response.text)
    # Extract the data using XPath
    try:
        product_title = product_parser.xpath('//h1[@class="x-item-title__mainTitle"]/span/text()')[0]
        price = product_parser.xpath('//div[@data-testid="x-price-primary"]/span/text()')[0]
        shipping_cost = product_parser.xpath('//div[@class="ux-labels-values col-12 ux-labels-values--shipping"]//div[@class="ux-labels-values__values-content"]/div/span/text()')[0]
        product_condition = product_parser.xpath('//div[@class="x-item-condition-text"]/div/span/span[2]/text()')[0]
        available_quantity = product_parser.xpath('//div[@class="x-quantity__availability"]/span/text()')[0]
        sold_quantity = product_parser.xpath('//div[@class="x-quantity__availability"]/span/text()')[1]
        payment_options = ', '.join(product_parser.xpath('//div[@class="ux-labels-values col-12 ux-labels-values__column-last-row ux-labels-values--payments"]/div[2]/div/div//span/@aria-label'))
        return_policy = product_parser.xpath('//div[@class="ux-layout-section ux-layout-section--returns"]//div[@class="ux-labels-values__values-content"]/div/span/text()')[0]
        # Store the data in a dictionary
        product_info = {
            'Title': product_title,
            'Price': price,
            'Shipping Cost': shipping_cost,
            'Condition': product_condition,
            'Available Quantity': available_quantity,
            'Sold Quantity': sold_quantity,
            'Payment Options': payment_options,
            'Return Policy': return_policy,
        }
        product_data.append(product_info)
    except IndexError as e:
        # A field was missing on this page, so the product is skipped
        print(f"An error occurred: {e}")
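In the loop above, a single missing field raises IndexError and the whole product is dropped. As an alternative, here is a minimal sketch of a helper that returns a default instead of failing. The name first_or_none is hypothetical; it takes the list returned by an XPath query, so it can be tried here with plain lists.

```python
def first_or_none(matches, default=None):
    """Return the first non-empty, stripped match from an XPath result list.

    Falls back to `default` when the list is empty or holds only whitespace.
    """
    for match in matches:
        text = str(match).strip()
        if text:
            return text
    return default


# Plain lists stand in for XPath results in this illustration
print(first_or_none(["  Example laptop title  "]))
print(first_or_none([], default="N/A"))
```

With such a helper, a call like first_or_none(product_parser.xpath(...)) could replace the [0] indexing, so that a product with no listed return policy is still saved with an empty value for that field.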
After collecting the data, we can save it to a CSV file using pandas.
import pandas as pd
# Convert the data to a DataFrame
df = pd.DataFrame(product_data)
# Save to CSV
df.to_csv('ebay_product_data.csv', index=False)
print("Data saved to ebay_product_data.csv")
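The scraped prices are stored as text, which makes sorting or averaging them awkward. The sketch below shows one way to convert a price string to a number before analysis. The helper name parse_price and the sample strings are hypothetical, and the sketch assumes prices use a period as the decimal separator and commas as thousands separators.

```python
import re


def parse_price(text):
    """Extract the first number from a price string as a float, or None if absent.

    Assumes a period decimal separator and comma thousands separators.
    """
    if not text:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return float(match.group(0).replace(",", ""))


# Hypothetical price strings for illustration
for raw in ["US $1,299.99", "US $499.00/ea", "Free shipping"]:
    print(raw, "->", parse_price(raw))
```

Applied to the DataFrame, this could look like df['Price'].apply(parse_price), producing a numeric column alongside the original text.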
eBay employs rate limiting to prevent excessive requests. Here are some methods to avoid detection: route requests through proxies, rotate user agents, and space requests out at responsible intervals.
By following these best practices, you can minimize the risk of being blocked and continue scraping data efficiently.
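As a rough illustration of these practices, the sketch below picks a user agent at random for each request and computes a randomized pause between requests. The helper names and the 2 to 5 second range are assumptions made for this example, not values recommended by eBay.

```python
import random
import time

# Pool of user agents to rotate through (the same strings used in the complete code)
USER_AGENTS = [
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36',
]


def build_headers(base_headers):
    """Return a copy of the headers with a randomly chosen user agent."""
    headers = dict(base_headers)
    headers['user-agent'] = random.choice(USER_AGENTS)
    return headers


def random_delay(min_seconds=2.0, max_seconds=5.0):
    """Return a random pause length in seconds (the range is an assumed default)."""
    return random.uniform(min_seconds, max_seconds)


def polite_pause(min_seconds=2.0, max_seconds=5.0):
    """Sleep for a random interval and return how long the pause was."""
    delay = random_delay(min_seconds, max_seconds)
    time.sleep(delay)
    return delay


# Show the values without actually sleeping
print(build_headers({'accept-language': 'en-IN,en;q=0.9'})['user-agent'])
print(round(random_delay(), 2))
```

Calling build_headers and polite_pause inside the product loop would give each request its own user agent and spacing, rather than one user agent for the whole run.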
Here is the complete code for scraping eBay data and saving it to a CSV file:
import requests
import random
from lxml.html import fromstring
import pandas as pd
# Pool of user agents; one is picked at random for this run
useragents = [
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36',
]
# Define headers for the requests
headers = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'accept-language': 'en-IN,en;q=0.9',
    'cache-control': 'no-cache',
    'dnt': '1',
    'pragma': 'no-cache',
    'priority': 'u=0, i',
    'sec-ch-ua': '"Google Chrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Linux"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'none',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': random.choice(useragents),
}
# Search query parameters
params = {'_nkw': 'laptop'}
# Proxy settings: replace IP:PORT with the address of your own proxy
proxies = {
    'http': 'IP:PORT',
    'https': 'IP:PORT'
}
# Fetch the listing page
listing_page_response = requests.get('https://www.ebay.com/sch/i.html', params=params, headers=headers, proxies=proxies)
listing_parser = fromstring(listing_page_response.text)
links = listing_parser.xpath('//div[@class="s-item__info clearfix"]/a[@_sp="p2351460.m1686.l7400"]/@href')
# Extract product data
product_data = []
for url in links:
    product_page_response = requests.get(url, headers=headers, proxies=proxies)
    product_parser = fromstring(product_page_response.text)
    try:
        product_info = {
            'Title': product_parser.xpath('//h1[@class="x-item-title__mainTitle"]/span/text()')[0],
            'Price': product_parser.xpath('//div[@data-testid="x-price-primary"]/span/text()')[0],
            'Shipping Cost': product_parser.xpath('//div[@class="ux-labels-values col-12 ux-labels-values--shipping"]//div[@class="ux-labels-values__values-content"]/div/span/text()')[0],
            'Condition': product_parser.xpath('//div[@class="x-item-condition-text"]/div/span/span[2]/text()')[0],
            'Available Quantity': product_parser.xpath('//div[@class="x-quantity__availability"]/span/text()')[0],
            'Sold Quantity': product_parser.xpath('//div[@class="x-quantity__availability"]/span/text()')[1],
            'Payment Options': ', '.join(product_parser.xpath('//div[@class="ux-labels-values col-12 ux-labels-values__column-last-row ux-labels-values--payments"]/div[2]/div/div//span/@aria-label')),
            'Return Policy': product_parser.xpath('//div[@class="ux-layout-section ux-layout-section--returns"]//div[@class="ux-labels-values__values-content"]/div/span/text()')[0]
        }
        product_data.append(product_info)
    except IndexError:
        # Skip products where an expected field is missing
        continue
# Save to CSV
df = pd.DataFrame(product_data)
df.to_csv('ebay_product_data.csv', index=False)
print("Data saved to ebay_product_data.csv")
Scraping eBay with Python allows efficient collection of data on products, pricing, and trends. In this guide, we covered scraping listings, handling pagination, setting headers, and using proxies to avoid detection. Remember to respect eBay's terms of service by using responsible request intervals and proxy rotation. With these tools, you can easily collect and analyze eBay data for market insights. Happy scraping!