Python でローカルサーバーを立て CGI を動かす（3）

2020.07.03 プログラミング, Python

Python で映画の上映情報をスクレイピングして一覧表示する Webアプリを作ってみようと試しています。

前記事：

Python で映画館の上映一覧をスクレイピング（2） – IMUZA.com

前回でスクレイピングの基礎がわかりましたので次はサーバーを立てて Webアプリに一歩近づけようと思います。

ローカルサーバー
CGIスクリプトを実行する

01ローカルサーバー

ローカルサーバーはターミナルに次のコマンドを打てば起動します。

$ python -m http.server 8000
Serving HTTP on :: port 8000 (http://[::]:8000/) ...

ブラウザで http://localhost:8000/ にアクセスしますと、実行したディレクトリの一覧が表示されます。

実行したディレクトリに index.html があればそのファイルが表示されます。たとえば、index.html を次のように置いておき、すでにサーバーを立ち上げていればリロードすれば、「Hello Python!」と表示されます。

<html>
<body>
  <p>Hello Python!</p>
</body>
</html>

また、カレントディレクトリに imgフォルダをつくり画像ファイルを置き、index.html を次のようにすれば画像が表示されます。

<html>
  <head>
    <meta charset="utf-8">
  </head>
<body>
  <h1>罪と女王</h1>
  <img style="width:100%" src="img/queen-of-hearts.jpg"/></a>
</body>
</html>

ドキュメントはこちらです。

http.server — HTTP サーバ — Python 3.8.4rc1 ドキュメント

02CGIスクリプトを実行する

CGIスクリプトを実行するにはサーバーを --cgi オプション付きで起動する必要があります。

$ python3 -m http.server --cgi 8000
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...

スクリプトは直下の cgi-bin ディレクトリの中に入れます。

ドキュメントはこちらにあります。

cgi — CGI (ゲートウェイインタフェース規格) のサポート — Python 3.7.8 ドキュメント

cgi-bin/sample.py 実行権限が必要です。

#!/usr/bin/python3


print("Content-Type: text/html")    # HTML is following
print()                             # blank line, end of headers
print("<TITLE>CGI script output</TITLE>")
print("<H1>This is my first CGI script</H1>")
print("Hello, world!")

http://localhost:8000/cgi-bin/sample.py にアクセスして表示されればOKです。

前記事でつくったミッドランドスクエアシネマの上映映画一覧を CGIスクリプトにして走らせてみます。

#!/usr/bin/python3
# -*- coding: UTF-8 -*-


import lxml
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import chromedriver_binary
import datetime


options = Options()
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)


today = datetime.date.today()
date = today + datetime.timedelta(days=2)
t_date = 's0100_0201_' + date.strftime('%Y%m%d')
d_date = str(date.month) + '/' + str(date.day)


url = 'https://ticket.online-midland-sq-cinema.jp/schedule/ticket/0201/index.html'


wait = WebDriverWait(driver, 10)


driver.get(url)


c_date = wait.until(EC.presence_of_element_located((By.ID, t_date)))
c_date.click()


wait.until(EC.text_to_be_present_in_element((By.ID, 'Day_title'), d_date))


html = driver.page_source
soup = BeautifulSoup(html, 'lxml')
found = soup.select('.MovieTitle1 h2')


print("Content-type: text/html\n")
print("<html>")
print('<head><meta charset="utf-8"></head>')
print("<h1>")
print(d_date)
print("</h1>")
print("<ul>")
for f in found:
  print("<li>")
  print(f.text)
  print("</li>")
print("</ul>")
print("</html>")
driver.quit()

おお！うまくいきました。

何となく先が見えてきました。

スッキリわかるPython入門スッキリわかるシリーズ

作者:国本大悟,須藤秋良,株式会社フレアリンク
発売日: 2019/06/13
メディア: Kindle版

#Python