Commit 16d7f8e

Merged in changes which make CDN serving of static files easy, add bower management of external static files, and transition the datatracker to use Django's staticfiles framework.
- Legacy-Id: 9976
2 parents b26efc8 + 82380e5 commit 16d7f8e

711 files changed

Lines changed: 12774 additions & 4466 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -30,4 +30,5 @@
 /lib
 /share
 /include
+/static
 /latest-coverage.json

README-CDN.rst

Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@
================================================================================
Serving Static Datatracker Files via a CDN
================================================================================

Intro
=====

With release 6.4.0, the way that the static files used by the datatracker are
handled changes substantially. Static files were previously versioned under a
top-level ``static/`` directory, but this is not the case any more. External
files (such as for instance ``jquery.min.js``) are now placed under
``ietf/externals/static/`` and updated using a tool called bower_, while
datatracker-specific files (images, css, js, etc.) are located under
``ietf/static/ietf/`` and ``ietf/secr/static/secr/`` respectively.

The following sections provide more details about handling of internals,
externals, and how deployment is done.

Serving Static Files via CDN
============================

Production Mode
---------------

If resources served over a CDN and/or with a high max-age don't have different
URLs for different versions, then any component upgrade which is accompanied
by a change in template functionality will have a long transition time
during which the new pages are served with old components, with possible
breakage. We want to avoid this.

The intention is that after a release has been checked out, but before it is
deployed, the standard Django ``collectstatic`` management command will be
run, resulting in all static files being collected from their working
directory location and placed in an appropriate location for serving via CDN.
This location will have the datatracker release version as part of its URL,
so that after the deployment of a new release, the CDN will be forced to fetch
the appropriate static files for that release.

An important part of this is to set up the ``STATIC_ROOT`` and ``STATIC_URL``
settings appropriately. In 6.4.0, the settings are as follows in production
mode::

   STATIC_URL = "https://www.ietf.org/lib/dt/%s/"%__version__
   STATIC_ROOT = CDN_ROOT + "/a/www/www6s/lib/dt/%s/"%__version__

The result is that all static files collected via the ``collectstatic``
command will be placed in a location served via CDN, with the release
version being part of the URL.
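
As a minimal sketch of why this forces a cache refresh, the following Python function derives both settings from a version string, using the same format strings as above. The function name ``versioned_static_settings`` and the empty default for ``cdn_root`` are assumptions made for illustration; the real value of ``CDN_ROOT`` is not given here.

```python
def versioned_static_settings(version, cdn_root=""):
    """Return (STATIC_URL, STATIC_ROOT) for a given release version.

    The format strings are the ones quoted in the README; cdn_root stands in
    for CDN_ROOT, whose real value is an assumption here.
    """
    static_url = "https://www.ietf.org/lib/dt/%s/" % version
    static_root = cdn_root + "/a/www/www6s/lib/dt/%s/" % version
    return static_url, static_root


if __name__ == "__main__":
    # Two releases never share a URL prefix, so a CDN cannot serve stale
    # files from one release to pages rendered by another.
    for release in ("6.4.0", "6.4.1"):
        print(versioned_static_settings(release))
```

Because the version is part of the path, every release gets a fresh URL namespace, and files cached for an older release are simply never requested by the new pages.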

Development Mode
----------------

In development mode, ``STATIC_URL`` is set to ``/static/``, and Django's
``staticfiles`` infrastructure makes the static files available under that
local URL root (unless you set
``settings.SERVE_CDN_FILES_LOCALLY_IN_DEV_MODE`` to ``False``). It is not
necessary to actually populate the ``static/`` directory by running
``collectstatic`` in order for static files to be served when running
``ietf/manage.py runserver`` -- the ``runserver`` command has extra support
for finding and serving static files without running ``collectstatic``.

In order to work backwards from a file served in development mode to the
location from which it is served, the mapping is as follows::

   ============================== ==============================
   Development URL                Working copy location
   ============================== ==============================
   localhost:8000/static/ietf/*   ietf/static/ietf/*
   localhost:8000/static/secr/*   ietf/secr/static/secr/*
   localhost:8000/static/*        ietf/externals/static/*
   ============================== ==============================
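
The mapping in the table can be sketched as a small Python lookup. This is only an illustration of the table, not of how Django's ``staticfiles`` finders work internally; the names ``DEV_STATIC_MAP`` and ``working_copy_path`` and the example file paths are hypothetical.

```python
# (URL path prefix, working-copy directory), most specific prefix first,
# mirroring the three rows of the table above.
DEV_STATIC_MAP = [
    ("/static/ietf/", "ietf/static/ietf/"),
    ("/static/secr/", "ietf/secr/static/secr/"),
    ("/static/", "ietf/externals/static/"),
]


def working_copy_path(url_path):
    """Map a development URL path to its working-copy location.

    Returns None when the path is not under /static/.
    """
    for prefix, directory in DEV_STATIC_MAP:
        if url_path.startswith(prefix):
            return directory + url_path[len(prefix):]
    return None


if __name__ == "__main__":
    # Example paths are illustrative, not taken from the repository.
    print(working_copy_path("/static/ietf/css/ietf.css"))
    print(working_copy_path("/static/jquery/jquery.min.js"))
```

The order of the entries matters: the catch-all ``/static/`` row has to come last, otherwise it would also claim the ``ietf/`` and ``secr/`` paths.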

Handling of External Javascript and CSS Components
==================================================

In order to make it easy to keep track of and upgrade external components,
these are now handled by a tool called ``bower``, via a new management
command ``bower_install``. Each external component is listed in a file
``ietf/bower.json``. In order to update the version of a component listed in
``ietf/bower.json``, or add a new one, you should edit ``bower.json``, and
then run the management command::

   $ ietf/manage.py bower_install

(Not surprisingly, you need to have bower_ installed in order to use this
management command.)

That command will fetch the required version of each external component listed
in ``bower.json`` (actually, it will do this for *all* ``bower.json`` files
found in the ``static/`` directories of all ``INSTALLED_APPS`` and the
directories in ``settings.STATICFILES_DIRS``), saving them temporarily under
``.tmp/bower_components/``; it will then extract the relevant production
``js`` and ``css`` files and place them in an appropriately named directory
under ``ietf/externals/static/``. The latter location is taken from
``COMPONENT_ROOT`` in ``settings.py``.

Managing external components via bower has the additional benefit of
managing dependencies -- components that have dependencies will pull in
these, so that they also are placed under ``ietf/externals/static/``.
You still have to manually add the necessary stylesheet and/or javascript
references to your templates, though.

The ``bower_install`` command is not run automatically by ``bin/mkrelease``,
since it needs an updated ``bower.json`` in order to do anything interesting.
So when you're intending to update an external web asset to a newer version,
you need to edit the ``bower.json`` file, run ``manage.py bower_install``,
verify that the new version doesn't break things, and then commit the new
files under ``ietf/externals/static/`` and the updated ``bower.json``.

.. _bower: http://bower.io/

The ``ietf/externals/static/`` Directory
-----------------------------------------

The directory ``ietf/externals/static/`` holds a number of subdirectories
which hold distribution files for external client-side components, collected
by ``bower_install`` as described above. Currently
(01 Aug 2015) this means ``js`` and ``css`` components and fonts.

These components each reside in their own subdirectory, which is named with
the component name::

   henrik@zinfandel $ ls -l ietf/externals/static/
   total 40
   drwxr-xr-x 6 henrik henrik 4096 Jul 25 15:25 bootstrap
   drwxr-xr-x 4 henrik henrik 4096 Jul 25 15:25 bootstrap-datepicker
   drwxr-xr-x 4 henrik henrik 4096 Jul 25 15:25 font-awesome
   drwxr-xr-x 2 henrik henrik 4096 Jul 25 15:25 jquery
   drwxr-xr-x 2 henrik henrik 4096 Jul 25 15:25 jquery.cookie
   drwxr-xr-x 2 henrik henrik 4096 Jul 25 15:24 ptmono
   drwxr-xr-x 2 henrik henrik 4096 Jul 25 15:24 ptsans
   drwxr-xr-x 2 henrik henrik 4096 Jul 25 15:24 ptserif
   drwxr-xr-x 2 henrik henrik 4096 Jul 25 15:25 select2
   drwxr-xr-x 2 henrik henrik 4096 Jul 25 15:25 select2-bootstrap-css

The ``pt*`` fonts are an exception, in that there is no bower component
available for these fonts, so they have been put in place manually.
140+
141+
Handling of Internal Static Files
142+
=================================
143+
144+
Previous to this release, internal static files were located under
145+
``static/``, mixed together with the external components. They are now
146+
located under ``ietf/static/ietf/`` and ``ietf/secr/static/secr``, and will be
147+
collected for serving via CDN by the ``collectstatic`` command. Any static
148+
files associated with a particular app will be handled the same way (which
149+
means that all ``admin/`` static files automatically will be handled correctly, too).

Handling of Customised Bootstrap Files
======================================

We are using a customised version of Bootstrap_, which is handled specially,
by an SVN externals definition in ``ietf/static/ietf``. That pulls the content
of the ``bootstrap/dist/`` directory (which is generated by running ``grunt``
in the ``bootstrap/`` directory) into ``ietf/static/ietf/bootstrap``, from
where it is collected by ``collectstatic``.

Changes to Template Files
=========================

In order to make the template files refer to the correct versioned CDN URL
(as given by the ``STATIC_URL`` root), all references to static files in the
templates have been updated to use the ``static`` template tag. As a result,
the templates automatically refer to the correct versioned URL in production
mode and to the simpler ``/static/`` URLs in development mode.
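
To illustrate the effect of the ``static`` tag, here is a rough Python approximation of the URL it produces: the relative path is joined onto the ``STATIC_URL`` root. This is a simplification under the assumption that the default static files storage is in use; the helper name ``static_url`` and the example path are hypothetical, and Django's own tag additionally takes care of details such as quoting.

```python
from urllib.parse import urljoin


def static_url(static_url_root, relative_path):
    """Approximate the URL produced by the ``static`` tag for a relative path."""
    return urljoin(static_url_root, relative_path)


if __name__ == "__main__":
    path = "ietf/css/ietf.css"  # illustrative path, not taken from the repository
    # Production: versioned CDN root, as configured in 6.4.0.
    print(static_url("https://www.ietf.org/lib/dt/6.4.0/", path))
    # Development: the local /static/ root.
    print(static_url("/static/", path))
```

The template itself never needs to know which mode it is running in; only the ``STATIC_URL`` root differs.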

.. _bootstrap: http://getbootstrap.com/

Deployment
==========

During deployment, it is now necessary to run the management command::

   $ ietf/manage.py collectstatic

before activating a new release.

The deployment ``README`` file at ``/a/www/ietf-datatracker/README`` has been
updated accordingly.
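
For a scripted deployment, the step could be wrapped roughly as follows. This is a sketch, not part of the datatracker: the helper names are hypothetical, and the extra flags are the ones that the ``bin/mkrelease`` change in this commit passes to ``collectstatic``.

```python
import subprocess


def collectstatic_command(manage_py="ietf/manage.py"):
    """Build the collectstatic command line as a list of arguments."""
    return [
        manage_py,
        "collectstatic",
        "--noinput",            # do not prompt before overwriting files
        "--ignore=bower.json",  # skip bower manifests
        "--ignore=README.*",    # skip README files
    ]


def run_collectstatic(manage_py="ietf/manage.py"):
    """Run collectstatic; raises CalledProcessError if the command fails."""
    subprocess.check_call(collectstatic_command(manage_py))
```

Using ``check_call`` means a failed collection stops the deployment script instead of letting a release go live without its static files.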

bin/mkrelease

Lines changed: 4 additions & 0 deletions
@@ -166,6 +166,10 @@ DEV="$(printf %d.%d.%d.dev0 $MAJOR $MINOR $NEXT)"
 
 #cd $DIR ??
 
+note "Collecting static files"
+$do ietf/manage.py collectstatic --noinput --ignore=bower.json --ignore='README.*'
+$do svn commit static/lib/ -m "Updated static files under static/lib/"
+
 # note "Checking that there's a recent test-crawler log"
 # touch -d $RDATE .svn/.latest-commit
 # TCLOG=$(ls -t ../test-crawl-*.log | head -n 1)

bin/test-crawl

Lines changed: 51 additions & 12 deletions
@@ -27,9 +27,13 @@ parser.add_argument('--validator-nu', dest='validator_nu', action='store_true',
                     help='Use validator.nu instead of html5lib for HTML validation')
 parser.add_argument('--pedantic', action='store_true',
                     help='Stop the crawl on the first HTML validation issue')
+parser.add_argument('--random', action='store_true',
+                    help='Crawl URLs randomly')
 parser.add_argument('--validate-all', dest='validate_all', action='store_true', default=False,
                     help='Run html 5 validation on all pages, without skipping similar urls. '
                     '(The default is to only run validation on one of /foo/1/, /foo/2/, /foo/3/, etc.)')
+parser.add_argument('-v', '--verbose', action='store_true', default=False,
+                    help='Be more verbose')
 
 args = parser.parse_args()
 
@@ -59,13 +63,18 @@ MAX_URL_LENGTH = 500
 
 # --- Functions ---
 
+def note(s):
+    if args.verbose:
+        sys.stderr.write(s)
+        sys.stderr.write('\n')
+
 def strip_url(url):
     if url.startswith("http://testserver"):
         url = url[len("http://testserver"):]
     return url
 
 def extract_html_urls(content):
-    for m in re.finditer(r'(<(?:a|link) [^>]*href=[\'"]([^"]+)[\'"][^>]*>)', content):
+    for m in re.finditer(r'(<(?:(?:a|link) [^>]*href|(?:img|script) [^>]*src)=[\'"]([^"]+)[\'"][^>]*>)', content):
         if re.search(r'rel=["\']?nofollow["\']', m.group(1)):
             continue
 
@@ -108,20 +117,36 @@ def check_html_valid(url, response, args):
     key = url
     if not args.validate_all:
         # derive a key for urls like this by replacing primary keys
-        key = re.sub("/[0-9.]+/", "/nnnn/", key)
-        key = re.sub("/.+@.+/", "/x@x.org/", key)
-        key = re.sub("#.*$", "", key)
         key = re.sub("\?.*$", "", key)
+        key = re.sub("#.*$", "", key)
+        key = re.sub("/.+@.+/", "/x@x.org/", key)
+        key = re.sub("/[0-9.]+/", "/nnnn/", key)
+        key = re.sub("/[0-9.]+/", "/mmmm/", key)
+        key = re.sub("/ag/[a-z0-9-]+/", "/ag/foo/", key)
+        key = re.sub("/area/[a-z0-9-]+/", "/area/foo/", key)
+        key = re.sub("/bcp[0-9]+/", "/bcpnnn/", key)
+        key = re.sub("/conflict-review-[a-z0-9-]+/", "/conflrev-foo/", key)
+        key = re.sub("/dir/[a-z0-9-]+/", "/dir/foo/", key)
+        key = re.sub("/draft-[a-z0-9-]+/", "/draft-foo/", key)
+        key = re.sub("/group/[a-z0-9-]+/", "/group/foo/", key)
+        key = re.sub("/ipr/search/.*", "/ipr/search/", key)
+        key = re.sub("/release/[0-9dev.]+/", "/release/n.n.n/", key)
         key = re.sub("/rfc[0-9]+/", "/rfcnnnn/", key)
-        key = re.sub("/wg/[a-z0-9-]+/", "/wg/foo/", key)
         key = re.sub("/rg/[a-z0-9-]+/", "/rg/foo/", key)
-        key = re.sub("/ipr/[0-9]+/", "/ipr/nnnn/", key)
-        key = re.sub("/draft-[a-z0-9-]+/", "/draft-foo/", key)
+        key = re.sub("/secr/srec/nnnn/[0-9a-z-]+/", "/secr/sreq/nn/bar/", key)
+        key = re.sub("/state/[a-z0-9-]+/", "/state/foo/", key)
+        key = re.sub("/state/[a-z0-9-]+/[a-z0-9-]+/", "/state/foo/bar/", key)
+        key = re.sub("/status-change-[a-z0-9-]+/", "/statchg-foo/", key)
+        key = re.sub("/std[0-9]+/", "/stdnnn/", key)
+        key = re.sub("/submit/status/nnnn/[0-9a-f]+/", "/submit/status/nnnn/bar/", key)
+        key = re.sub("/team/[a-z0-9-]+/", "/team/foo/", key)
+        key = re.sub("/wg/[a-z0-9-]+/", "/wg/foo/", key)
+
         for slug in doc_types:
             key = re.sub("/%s-.*/"%slug, "/%s-nnnn/"%slug, key)
 
     if not key in validated_urls:
-
+        note('Validate: %-32s: %s' % (url[:32], key))
         # These URLs have known issues, skip them until those are fixed
         if re.search('(/secr|admin/)|/doc/.*/edit/info/', url):
             log("%s blacklisted; skipping HTML validation" % url)
@@ -159,6 +184,14 @@ def check_html_valid(url, response, args):
                          (pos, code))
                 warnings += 1
 
+def skip_url(url):
+    for pattern in (
+        "^/community/[0-9]+/remove_document/",
+        "^/community/personal/",
+        ):
+        if re.search(pattern, url):
+            return True
+    return False
 
 def log(s):
     print(s)
@@ -237,12 +270,18 @@ if __name__ == "__main__":
         sys.exit(1)
 
     while urls:
-        # popitem() is documented to be random, but really isn't
-        url = random.choice(urls.keys())
-        referrer = urls.pop(url)
+        if args.random:
+            # popitem() is documented to be random, but really isn't
+            url = random.choice(urls.keys())
+            referrer = urls.pop(url)
+        else:
+            url, referrer = urls.popitem()
 
         visited.add(url)
 
+        if skip_url(url):
+            continue
+
         try:
             timestamp = datetime.datetime.now()
             r = client.get(url, secure=True, follow=True)
@@ -298,7 +337,7 @@ if __name__ == "__main__":
                     log("=============")
 
             else:
-                tags.append(u"FAIL for %s\n  (from %s)" % (url, referrer))
+                tags.append(u"FAIL (from %s)" % (referrer, ))
                 errors += 1
 
             if elapsed.total_seconds() > slow_threshold:
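
The key derivation in the ``check_html_valid`` hunk above collapses URLs that differ only in an identifier, so that one representative page per URL shape is validated. A reduced sketch of the idea follows; it uses only a subset of the substitutions from the diff, and the names ``KEY_RULES``, ``validation_key`` and ``should_validate`` are hypothetical helpers, not functions in ``bin/test-crawl``.

```python
import re

# A subset of the substitutions from check_html_valid(), in the same order:
# strip the query string and fragment first, then replace identifiers.
KEY_RULES = [
    (r"\?.*$", ""),
    (r"#.*$", ""),
    (r"/[0-9.]+/", "/nnnn/"),
    (r"/draft-[a-z0-9-]+/", "/draft-foo/"),
    (r"/rfc[0-9]+/", "/rfcnnnn/"),
    (r"/wg/[a-z0-9-]+/", "/wg/foo/"),
]


def validation_key(url):
    """Reduce a URL to a key shared by all URLs of the same shape."""
    key = url
    for pattern, replacement in KEY_RULES:
        key = re.sub(pattern, replacement, key)
    return key


def should_validate(url, validated):
    """Return True only for the first URL seen with a given key.

    Recording the key here is an assumption about the surrounding logic,
    which is not visible in the diff.
    """
    key = validation_key(url)
    if key in validated:
        return False
    validated.add(key)
    return True


if __name__ == "__main__":
    seen = set()
    for u in ("/doc/rfc2119/", "/doc/rfc8174/?include_text=1", "/wg/httpbis/documents/"):
        print(u, validation_key(u), should_validate(u, seen))
```

Stripping the query string and fragment before the other rules, as the commit now does, keeps those parts from interfering with the path-based patterns.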
