Research on Real-time E-commerce Price Comparison System Using Python Web Scraping Technology

Authors

  • Fan Chen

DOI:

https://doi.org/10.62051/ijcsit.v4n2.18

Keywords:

Python, Web Scraping, E-Commerce, Price Comparison, Real-time data, BeautifulSoup, Scrapy, Selenium, Anti-scraping measures

Abstract

This paper presents a real-time price comparison system for e-commerce platforms built on Python web scraping technologies. The objective is to collect and compare product prices across three major platforms (Taobao, JD.com, and Amazon) using the scraping libraries BeautifulSoup, Scrapy, and Selenium. The study outlines the technical challenges of dynamic content extraction and anti-scraping mechanisms, and assesses how effective each tool is against them. Through experimental evaluation, we compare scraping efficiency, success rate, and data accuracy across the three platforms. The results indicate that JD.com, whose content is mostly static, offers the highest scraping efficiency with BeautifulSoup, while Amazon, despite slower page loading, achieves a high success rate with Scrapy. Taobao presents significant challenges because of its dynamic loading and strict anti-scraping measures, which makes Selenium the most appropriate tool for it, albeit with slower processing times. These findings can guide the choice of scraping tool for a real-time price comparison system and show how platform-specific challenges affect the accuracy and efficiency of data extraction.
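
The three evaluation metrics named in the abstract (scraping efficiency, success rate, and data accuracy) could be computed along the following lines. This is a minimal sketch using only the Python standard library, not the paper's actual evaluation code. The `ScrapeAttempt` record, the `summarize` function, the idea of a manually verified `reference_price`, and the platform labels are all assumptions made for illustration, and the numbers are placeholders rather than measurements from the study.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class ScrapeAttempt:
    """One scrape of one product page (hypothetical record layout)."""
    platform: str                   # label of the platform being scraped
    seconds: float                  # wall-clock time spent on the attempt
    scraped_price: Optional[float]  # None when the scrape failed
    reference_price: float          # assumed manually verified price


def summarize(attempts):
    """Group attempts by platform and compute the three metrics."""
    by_platform = defaultdict(list)
    for attempt in attempts:
        by_platform[attempt.platform].append(attempt)

    report = {}
    for platform, rows in by_platform.items():
        succeeded = [r for r in rows if r.scraped_price is not None]
        # A scraped price counts as accurate if it matches the reference
        # price to within half a cent.
        correct = [r for r in succeeded
                   if abs(r.scraped_price - r.reference_price) < 0.005]
        report[platform] = {
            "success_rate": len(succeeded) / len(rows),
            "mean_seconds": mean(r.seconds for r in rows),
            "accuracy": len(correct) / len(succeeded) if succeeded else 0.0,
        }
    return report


# Placeholder values for illustration only; not results from the paper.
demo = [
    ScrapeAttempt("PlatformA", 1.0, 10.0, 10.0),
    ScrapeAttempt("PlatformA", 3.0, None, 10.0),
    ScrapeAttempt("PlatformB", 2.0, 9.5, 9.0),
    ScrapeAttempt("PlatformB", 2.0, 9.0, 9.0),
]
report = summarize(demo)
for platform, metrics in report.items():
    print(platform, metrics)
```

Here accuracy is measured only over successful attempts, so a platform that blocks most requests can still score well on accuracy while scoring poorly on success rate. Keeping the two metrics separate is one way to distinguish anti-scraping failures from parsing errors.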

Published

10-10-2024

Section

Articles

How to Cite

Chen, F. (2024). Research on Real-time E-commerce Price Comparison System Using Python Web Scraping Technology. International Journal of Computer Science and Information Technology, 4(2), 127-136. https://doi.org/10.62051/ijcsit.v4n2.18