How to scrape a table with rvest and XPath?

Following the documentation, I am trying to scrape a series of tables from marketwatch.com. Here is the one represented by the code below (screenshot: https://i.stack.imgur.com/ak9iH.png). The link and XPath are already included in the code:

url <- "http://www.marketwatch.com/investing/stock/IRS/profile" 
valuation <- url %>% 
    html() %>% 
    html_nodes(xpath='//*[@id="maincontent"]/div[2]/div[1]') %>% 
    html_table() 
valuation <- valuation[[1]]

I get the following error:

Warning message: 
'html' is deprecated. 
Use 'read_html' instead. 
See help("Deprecated")

Thanks in advance.

Source

2016-02-29 Alex Bădoi

Remove 'html()' and replace it with 'read_html()'. – cory

That is not an error, it is a warning. Your code will still run with that warning. – SymbolixAU

Thanks. Fixed. –

That website doesn't use an HTML table, so html_table() finds nothing. It actually uses elements with the classes column and data lastcolumn.

So you can do something like

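library(rvest)  # provides read_html(), html_nodes() and the %>% pipe

url <- "http://www.marketwatch.com/investing/stock/IRS/profile"

# Nodes whose class attribute is exactly "column"
valuation_col <- url %>%
    read_html() %>%
    html_nodes(xpath='//*[@class="column"]')

# Nodes whose class attribute is exactly "data lastcolumn"
valuation_data <- url %>%
    read_html() %>%
    html_nodes(xpath='//*[@class="data lastcolumn"]')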

or even

url %>% 
    read_html() %>% 
    html_nodes(xpath='//*[@class="section"]')

to get you most of the way there.
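For the remaining step, here is a rough sketch of one way to pair the two node sets into a data frame. It assumes the matches come back in the same order and in equal numbers, which I have not checked against the live page. The HTML string, the metric names, and the numbers are made up for illustration so the snippet runs offline; the real markup may differ.

```r
library(rvest)  # re-exports read_html() and the %>% pipe

# Hypothetical stand-in for the page markup (assumption, not the real page).
page <- read_html('<div class="section">
    <p class="column">P/E Current</p>
    <p class="data lastcolumn">12.34</p>
  </div>
  <div class="section">
    <p class="column">Price to Sales Ratio</p>
    <p class="data lastcolumn">0.56</p>
  </div>')

# Extract the trimmed text of each node set.
labels <- page %>%
    html_nodes(xpath = '//*[@class="column"]') %>%
    html_text(trim = TRUE)
values <- page %>%
    html_nodes(xpath = '//*[@class="data lastcolumn"]') %>%
    html_text(trim = TRUE)

# Pairing by position is only valid if both sets have the same length.
stopifnot(length(labels) == length(values))

valuation <- data.frame(metric = labels, value = values,
                        stringsAsFactors = FALSE)
print(valuation)
```

On the real site you would replace the inline string with the url from above. If the two lengths differ, the stopifnot() call fails, and selecting the "section" nodes and reading both children from each one is probably the safer route.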

Please also read their terms of use, in particular 3.4.

Source

2016-03-01 00:30:14 SymbolixAU
