Msx Wu
Sep 7, 2018 · 7 min read

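Java Web Scraping in Practice (2): Taiwan Stock Exchange, Competitive Auctions

Spec sheet:

Target URL: http://www.twse.com.tw/zh/

Preliminary analysis:

No page scraping is needed here: the CSV download link is a URL that changes in a regular pattern, so it can be built directly.

The file is downloaded from: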

http://www.twse.com.tw/announcement/auction?response=csv&yy=[zyear]

Replace [zyear] with the current system year, for example:

http://www.twse.com.tw/announcement/auction?response=csv&yy=2018

Download the CSV first, then read it.
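The URL substitution described above can be sketched with the standard library alone. The class name `AuctionUrl` and the helper `build` are hypothetical names introduced for illustration; only the URL template comes from the post.

```java
import java.time.Year;

class AuctionUrl {
    // URL template taken from the post; [zyear] is the placeholder for the year.
    static final String TEMPLATE =
            "http://www.twse.com.tw/announcement/auction?response=csv&yy=[zyear]";

    // Hypothetical helper: fills the placeholder with the given year.
    static String build(int year) {
        return TEMPLATE.replace("[zyear]", String.valueOf(year));
    }

    public static void main(String[] args) {
        // Year.now() reads the system clock, so the output depends on when this runs.
        System.out.println(build(Year.now().getValue()));
    }
}
```

Calling `AuctionUrl.build(2018)` gives the 2018 URL shown above.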

Coding:

HTMLDL dl;
String file_name = "auction"; // define file name
// year: the system year as a String; it replaces the [zyear] placeholder
String url = "http://www.twse.com.tw/announcement/auction?response=csv&yy=[zyear]".replace("[zyear]", year);
// download the CSV by URL
try {
    dl = new HTMLDL(url, false);
    // save as <user.dir>\data\auction.csv
    dl.setDownloadParameter(System.getProperty("user.dir") + "\\data\\", file_name + ".csv");
    dl.download();
} catch (Exception e) {
    logger.error(ErrorTitle.DOWNLOAD_TITLE.getTitle(), e);
}

That covers the download step.

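A helper method is used here that scans the given path for files with a specified extension and returns a List: every file whose extension matches has its file name added to the list.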

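public static String spt = System.getProperty("file.separator");
String path = System.getProperty("user.dir") + spt + "data";
List<String> files = numbers(path); // file names with the matching extension
for (String filenm : files) {
    // pass the full path of each file, not the directory itself
    List<CSVRecord> lines = readfileCsv(path + spt + filenm);
}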
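The body of `numbers(path)` is not shown in the post. A minimal sketch of what such a helper could look like, assuming it returns bare file names and matches the extension case-insensitively, follows. The class name `FileLister`, the method name `listByExtension`, and the explicit extension parameter are all hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FileLister {
    // Hypothetical stand-in for the post's numbers(path): returns the names
    // (not full paths) of regular files in dir whose name ends with extension.
    static List<String> listByExtension(String dir, String extension) {
        List<String> names = new ArrayList<>();
        File[] entries = new File(dir).listFiles();
        if (entries == null) { // dir is missing or is not a directory
            return names;
        }
        Arrays.sort(entries); // listFiles() gives no ordering guarantee
        for (File entry : entries) {
            // compare in lower case so "A.CSV" also matches ".csv"
            if (entry.isFile() && entry.getName().toLowerCase().endsWith(extension)) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String path = System.getProperty("user.dir") + File.separator + "data";
        System.out.println(listByExtension(path, ".csv"));
    }
}
```

Because only names are returned, the caller still has to join each name with the directory before opening the file.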

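This uses a new method, readfileCsv(path):

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public List<CSVRecord> readfileCsv(String path) {
    List<CSVRecord> list = new ArrayList<CSVRecord>();
    InputStreamReader fs = null;
    BufferedReader bfin = null;
    // try to read the file
    try {
        // decode the file as BIG5
        fs = new InputStreamReader(new FileInputStream(path), "BIG5");
        bfin = new BufferedReader(fs);
        CSVFormat format = CSVFormat.DEFAULT.withSkipHeaderRecord().withDelimiter(',');
        CSVParser parser = new CSVParser(bfin, format);
        list = parser.getRecords();
    } catch (Exception e) {
        logger.error("Failed to read file: " + e + "\r\n");
    } finally {
        try {
            // bfin stays null if the file could not be opened
            if (bfin != null) {
                bfin.close(); // also closes the wrapped fs
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return list;
}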

It simply reads the CSV and returns the result as a List<CSVRecord>.
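The resource handling in readfileCsv can also be written with try-with-resources, which closes the reader automatically. The sketch below is not the author's method: it uses only the standard library and returns raw lines rather than CSVRecord objects, so it runs without Commons CSV. The class name `Big5Lines` and the file location in `main` are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class Big5Lines {
    // Reads every line of a Big5-encoded text file. try-with-resources closes
    // the reader even if reading throws, so no finally block is needed.
    static List<String> read(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), Charset.forName("Big5")))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location, mirroring the data folder used in the download step.
        Path file = Paths.get(System.getProperty("user.dir"), "data", "auction.csv");
        for (String line : read(file)) {
            System.out.println(line);
        }
    }
}
```

In the real method the same `try (...)` header could wrap the reader handed to `CSVParser`, since the parser only needs a `Reader`.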

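The next step, pulling out the CSV contents, is just a matter of calling get repeatedly.

for (int i = 2; i < lines.size(); i++) {
    CSVRecord csvRecord = lines.get(i); // one CSV row per iteration
}

The rows at i=0 and i=1 are header rows I don't need, so I start reading from i=2.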

lines.get(2)

The result looks like this:

CSVRecord [comment=null, mapping=null, recordNumber=3, values=[1, 2018/09/07, 基士德-KY, 6641, 集中交易市場, 第一上市初上市, 美國標, 2018/09/03, 2018/09/05, 2,719, 62.5, 1, 365, 50, 400, 2018/09/21, 永豐金證券, 219,225,960, 4.5, 933, 14,627, 78.1, 96.96, 75, , ]]

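In other words, the loop takes the CSV out one row at a time.

To read the individual fields of a CSVRecord, you again just call get.

For example, using the CSVRecord shown above:

CSVRecord csvRecord = lines.get(2);
System.out.println(csvRecord.get(2)); // third field of the row

The output is directly a String: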

基士德-KY

The other fields work the same way; it is all very intuitive.
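Since every field comes back as a String, numeric values with thousands separators (such as 2,719 in the record above) need a conversion step before they can be used as numbers. A minimal sketch follows, using a plain List<String> in place of CSVRecord so it runs without Commons CSV; with the real record, `csvRecord.get(i)` would take the place of `values.get(i)`. The class name `FieldParsing` and the helper `parseGrouped` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class FieldParsing {
    // Parses a value such as "219,225,960" by dropping the thousands separators.
    static long parseGrouped(String raw) {
        return Long.parseLong(raw.trim().replace(",", ""));
    }

    public static void main(String[] args) {
        // First ten values of the record shown above, held in a plain list.
        List<String> values = Arrays.asList("1", "2018/09/07", "基士德-KY", "6641",
                "集中交易市場", "第一上市初上市", "美國標", "2018/09/03", "2018/09/05", "2,719");
        System.out.println(values.get(2));               // 基士德-KY
        System.out.println(parseGrouped(values.get(9))); // 2719
    }
}
```

Empty fields, like the trailing ones in the sample record, would make `Long.parseLong` throw, so real code should check for them first.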

After that, it is up to you how to store the data.

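Journal of an information management student at a private university

What separates a diploma mill from a top university? The diploma-mill student's sense of inferiority.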
