excel-vba page automatically - Scraping data from website using vba
Im trying to scrape data from website: http://uk.investing.com/rates-bonds/financial-futures via vba, like real-time price, i.e. German 5 YR Bobl, US 30Y T-Bond, i have tried excel web query but it only scrapes the whole website, but I would like to scrape the rate only, is there a way of doing this?
Scraping data from website using vba
There are several ways of doing this. This is an answer that I write hoping that all the basics of Internet Explorer automation will be found when browsing for the keywords "scraping data from website", but remember that nothing's worth as your own research (if you don't want to stick to pre-written codes that you're not able to customize).
Please note that this is one way, that I don't prefer in terms of performance (since it depends on the browser speed) but that is good to understand the rationale behind Internet automation.
1) If I need to browse the web, I need a browser! So I create an Internet Explorer browser:
Dim appIE As Object Set appIE = CreateObject("internetexplorer.application")
2) I ask the browser to browse the target webpage. Through the use of the property ".Visible", I decide if I want to see the browser doing its job or not. When building the code is nice to have
Visible = True, but when the code is working for scraping data is nice not to see it everytime so
Visible = False.
With appIE .Navigate "http://uk.investing.com/rates-bonds/financial-futures" .Visible = True End With
3) The webpage will need some time to load. So, I will wait meanwhile it's busy...
Do While appIE.Busy DoEvents Loop
4) Well, now the page is loaded. Let's say that I want to scrape the change of the US30Y T-Bond: What I will do is just clicking F12 on Internet Explorer to see the webpage's code, and hence using the pointer (in red circle) I will click on the element that I want to scrape to see how can I reach my purpose.
5) What I should do is straight-forward. First of all, I will get by the ID property the
tr element which is containing the value:
Set allRowOfData = appIE.document.getElementById("pair_8907")
Here I will get a collection of
td elements (specifically,
tr is a row of data, and the
td are its cells. We are looking for the 8th, so I will write:
Dim myValue As String: myValue = allRowOfData.Cells(7).innerHTML
Why did I write 7 instead of 8? Because the collections of cells starts from 0, so the index of the 8th element is 7 (8-1). Shortly analysing this line of code:
.Cells()makes me access the
innerHTMLis the property of the cell containing the value we look for.
Once we have our value, which is now stored into the
myValue variable, we can just close the IE browser and releasing the memory by setting it to Nothing:
appIE.Quit Set appIE = Nothing
Well, now you have your value and you can do whatever you want with it: put it into a cell (
Range("A1").Value = myValue), or into a label of a form (
Me.label1.Text = myValue).
I'd just like to point you out that this is not how works: here you post questions about specific coding problems, but you should make your own search first. The reason why I'm answering a question which is not showing too much research effort is just that I see it asked several times and, back to the time when I learned how to do this, I remember that I would have liked having some better support to get started with. So I hope that this answer, which is just a "study input" and not at all the best/most complete solution, can be a support for next user having your same problem. Because I have learned how to program thanks to this community, and I like to think that you and other beginners might use my input to discover the beautiful world of programming.
Enjoy your practice ;)
This question asked long before. But I thought following information will useful for newbies. Actually you can easily get the values from class name like this.
Sub ExtractLastValue() Set objIE = CreateObject("InternetExplorer.Application") objIE.Top = 0 objIE.Left = 0 objIE.Width = 800 objIE.Height = 600 objIE.Visible = True objIE.Navigate ("https://uk.investing.com/rates-bonds/financial-futures/") Do DoEvents Loop Until objIE.readystate = 4 MsgBox objIE.document.getElementsByClassName("pid-8907-last")(0).innerText End Sub
And if you are new to web scraping please read this blog post.
And also there are various techniques to extract data from web pages. This article explain few of them with examples.
you can use winhttprequest object instead of internet explorer as it's good to load data excluding pictures n advertisement instead of downloading full webpage including advertisement n pictures those make internet explorer object heavy compare to winhttpRequest object.
Website data table scraper
If you want to use Excel functions to navigate the tables why not dump the tables first onto a worksheet this code works for me
Option Explicit Sub Scraper() Dim appIE As Object ' As per http://.com/questions/27066963/scraping-data-from-website-using-vba Set appIE = CreateObject("internetexplorer.application") With appIE .Navigate "http://uk.investing.com/rates-bonds/financial-futures" 'Sample page .Visible = True End With Do While appIE.Busy DoEvents Application.Wait (Now + TimeValue("0:00:01")) 'If page not open, wait a second befor trying again Loop 'Debug.Print TypeName(appIE.document) Dim doc As Object 'MSHTML.HTMLDocument Set doc = appIE.document '* appIE busy is good but you need to wait for the whole document to completely load and initialise so use this While doc.readyState <> "complete" DoEvents Wend '* we can select all the tables because they share the same CSS class name Dim tablesSelectedByClass As Object 'MSHTML.HTMLElementCollection Set tablesSelectedByClass = doc.getElementsByClassName("genTbl") '* you can change this, it was just convenient for me to add sheets to my workbook Dim shNewResults As Excel.Worksheet Set shNewResults = ThisWorkbook.Worksheets.Add Dim lRowCursor As Long '* this controls pasting down the sheet lRowCursor = 1 Dim lTableIndexLoop As Long For lTableIndexLoop = 0 To tablesSelectedByClass.Length - 1 Dim tableLoop As Object 'MSHTML.HTMLTable Set tableLoop = tablesSelectedByClass.Item(lTableIndexLoop) If LenB(tableLoop.ID) > 0 Then '* there are some extra nonsense tables, this subselects Dim sParentColumn As String, objParentColumn As Object ' MSHTML.HTMLSemanticElement Set objParentColumn = FindMyColumn(tableLoop, sParentColumn) '* need to understand is table on left hand or right hand side Dim vHeader As Variant: vHeader = Empty If sParentColumn = "leftColumn" Then '* tables on the left have a preceding H3 element with the table's description Dim objH3Headers As Object Set objH3Headers = objParentColumn.getElementsByTagName("H3") vHeader = objH3Headers.Item(lTableIndexLoop).innerText Else '* tables on the right have a hidden attribute we can use vHeader = tableLoop.Attributes.Item("data-gae").Value If Len(vHeader) > 3 Then vHeader = Mid$(vHeader, 4) Mid$(vHeader, 1, 1) = Chr(Asc(Mid$(vHeader, 1, 1)) - 32) End If End If '* tables on the right do not have column headers Dim bHasColumnHeaders As Boolean bHasColumnHeaders = (tableLoop.ChildNodes.Length = 2) Dim vTableCells() As Variant '* this will be our table data container which we will paste in one go Dim lRowCount As Long: lRowCount = 0 Dim lColumnCount As Long: lColumnCount = 0 Dim lDataHeadersSectionIdx As Long: lDataHeadersSectionIdx = 0 Dim objColumnHeaders As Object: Set objColumnHeaders = Nothing If bHasColumnHeaders Then Set objColumnHeaders = tableLoop.ChildNodes.Item(0).ChildNodes.Item(0) lRowCount = lRowCount + 1 lDataHeadersSectionIdx = 1 Else lDataHeadersSectionIdx = 0 End If Dim objDataRows As Object 'MSHTML.HTMLElementCollection Set objDataRows = tableLoop.ChildNodes.Item(lDataHeadersSectionIdx).ChildNodes lColumnCount = objDataRows.Item(0).ChildNodes.Length lRowCount = lRowCount + objDataRows.Length ReDim vTableCells(1 To lRowCount, 1 To lColumnCount) As Variant '* we have them get the column headers Dim lColLoop As Long If bHasColumnHeaders Then For lColLoop = 1 To lColumnCount vTableCells(1, lColLoop) = objColumnHeaders.ChildNodes.Item(lColLoop - 1).innerText Next End If '* get the data cells Dim lRowLoop As Long For lRowLoop = 1 To lRowCount - VBA.IIf(bHasColumnHeaders, 1, 0) For lColLoop = 1 To lColumnCount vTableCells(lRowLoop + VBA.IIf(bHasColumnHeaders, 1, 0), lColLoop) = objDataRows.Item(lRowLoop - 1).ChildNodes.Item(lColLoop - 1).innerText Next Next '* paste our table description shNewResults.Cells(lRowCursor, 1).Value2 = vHeader lRowCursor = lRowCursor + 1 '* paste our table data shNewResults.Cells(lRowCursor, 1).Resize(lRowCount, lColumnCount).Value2 = vTableCells lRowCursor = lRowCursor + lRowCount + 1 End If Next End Sub Function FindMyColumn(ByVal node As Object, ByRef psColumn As String) As Object '* this code ascends the DOM looking for "column" in the id of each node While InStr(1, node.ID, "column", vbTextCompare) = 0 And Not node.ParentNode Is Nothing DoEvents Set node = node.ParentNode Wend If InStr(1, node.ID, "column", vbTextCompare) > 0 Then Set FindMyColumn = node psColumn = CStr(node.ID) End If End Function
By the way, if you trade a lot the brokers get rich and you get poor, brokerage charges really impact in long run.