Commit ba0e68a

feat: issue2551068 - Provide way to retrieve file/msg data via rest endpoint.
Use the Accept header to change the format of the /binary_content endpoint. If the Accept header for the endpoint is not application/json, it is matched against the mime type of the file. */* and text/* are supported and return the native mime type if present.

Changes: move the */* mime type out of the static dict of supported types. It was hardcoded to return json only. Now it can return a matching non-json mime type for the /binary_content endpoint. Edited some errors to explicitly add the */* mime type. Error messages now separate lists of valid mime types with ', ' rather than just a space.

Remove the ETag header when sending raw content. See issue 2551375 for background.

Doc added to rest.txt. Small format fix (add dash) in CHANGES.txt.

Make passing an unset/None/False accept_mime_type to format_dispatch_output a 500 error. This used to be the fallback that produced a 406 error after all processing had happened. It should no longer be possible to take that code path, as all 406 errors (with valid accept_mime_types) are generated before processing takes place.

Make format_dispatch_output handle output other than json/xml so it can send back binary_content data.

Removed a spurious client.response_code = 400 that seems not to be used.

Tests added for all code paths. The test database setup now creates a msg and a file entry. This required a file upload test to change so it doesn't look for file1 as the link returned by the upload. It now downloads the link and verifies the data rather than verifying the link.
1 parent a44e4c8 commit ba0e68a

4 files changed, +492 -20 lines changed

CHANGES.txt

Lines changed: 6 additions & 1 deletion
@@ -47,7 +47,7 @@ Features:
 - issue2551315 - Document use of
   RestfulInstance.max_response_row_size to limit data returned
   from rest request.
-- issue2551330 Add an optional 'filter' function to the Permission
+- issue2551330 - Add an optional 'filter' function to the Permission
   objects and the addPermission method. This is used to optimize search
   performance by not checking items returned from a database query
   one-by-one (using the check function) but instead offload the
@@ -60,6 +60,11 @@ Features:
   address. This logs the actual client address when
   roundup-server is run behind a reverse proxy. It also appends a
   + sign to the logged address/name. (John Rouillard)
+- issue2551068 - Provide way to retrieve file/msg data via rest
+  endpoint. Raw file/msg data can be retrieved using the
+  /binary_content attribute and an Accept header to select the mime
+  type for the data (e.g. image/png for a png file). The existing html
+  interface method still works and is supported, but is legacy.
 
 
 2024-07-13 2.4.0

doc/rest.txt

Lines changed: 83 additions & 1 deletion
@@ -368,7 +368,7 @@ extension ``.json`` or ``.xml`` to the path component of the url. This
 will force json or xml (if supported) output. If you use an extension
 it takes priority over any accept headers. Note the extension does not
 work for the ``/rest`` or ``/rest/data`` paths. In these cases it
-returs a 404 error. Adding the header ``Accept: application/xml``
+returns a 404 error. Adding the header ``Accept: application/xml``
 allows these paths to return xml data.
 
 The rest interface returns status 406 if you use an unrecognized
@@ -378,6 +378,8 @@ the accept header are available or if the accept header is invalid.
 Note: ``dicttoxml2.py`` is an updated version of ``dicttoxml.py``. If
 you are still using Python 2.7 or 3.6, you can use ``dicttoxml.py``.
 
+Also the ``/binary_content`` attribute endpoint can be used to
+retrieve raw file data in many formats.
 
 General Guidelines
 ------------------
@@ -906,6 +908,86 @@ You can retreive a message with a url like
         }
     }
 
+With Roundup 2.5 you can retrieve the data directly from the rest
+interface using the ``Accept`` header value to select a structured (json
+or optional xml) representation (as above) or a stream with just the
+content data.
+
+Using the wildcard type ``*/*`` in the ``Accept`` header with the url
+``.../binary_content`` will return the raw data and the recorded mime
+type of the the data as the ``Content-Type``. Using ``*/*`` with
+another end point will return ``json`` data. An ``Accept`` value of
+``application/octet-stream`` matches any mime type and retrieves the
+raw data as ``Content-Type: application/octet-stream``.
+
+To access the contents of a PNG image file (in file23), you use the
+following link:
+``https://.../demo/rest/data/file/23/binary_content``. To find out the
+mime type, you can check this URL:
+``https://.../demo/rest/data/file/23/type``.
+
+By setting the header to ``Accept: application/octet-stream; q=1.0,
+application/json; q=0.5``, you will receive the binary PNG file with
+the header ``Content-Type: application/octet-stream``. If you switch
+the ``q`` values, you will receive the encoded JSON version::
+
+    {
+        "data": {
+            "id": "23",
+            "type": "<class 'bytes'>",
+            "link": "https://.../demo/rest/data/file/23/binary_content",
+            "data": "b'\\x89PNG\\r\\n\\x1a\\n\\x00[...]0\\x00\\x00\\x00IEND\\xaeB`\\x82'",
+            "@etag": "\"db6adc1b09d95b0388d79c7905bc7982\""
+        }
+    }
+
+with ``Content-Type: application/json`` and a (4x larger) json encoded
+representation of the binary data.
+
+If you want it returned with a ``Content-Type: image/png`` header,
+you can use ``image/png`` or ``*/*`` in the Accept header.
+
+For message files, you can use
+``https://.../demo/rest/data/msg/23/binary_content`` with ``Accept:
+application/octet-stream; q=0.5, application/json; q=0.4, image/png;
+q=0.495, text/*``. It will return the plain text of the message.
+
+Most message files are not stored with a mime type. Getting
+``https://.../demo/rest/data/msg/23/type`` returns::
+
+    {
+        "data": {
+            "id": "23",
+            "type": "<class 'NoneType'>",
+            "link": "https://.../demo/rest/data/msg/23/type",
+            "data": null,
+            "@etag": "\"ba98927a8bb4c56f6cfc31a36f94ad16\""
+        }
+    }
+
+The data attribute will usually be null/empty. As a result, mime type
+matching for an item without a mime type is forgiving.
+
+Messages are meant to be human readable, so the mime type ``text/*``
+can be used to access any text style mime type (``text/plain``,
+``text/x-rst``, ``text/markdown``, ``text/html``, ...) or an empty
+mime type. If the item's type is not empty, it will be used as the
+Content-Type (similar to ``*/*``). Otherwise ``text/*`` will be the
+Content-Type. If your tracker supports markup languages
+(e.g. markdown), you should set the mime type (e.g. ``text/markdown``)
+when storing your message.
+
+Note that the header ``X-Content-Type-Options: nosniff`` is returned
+with a non javascript or xml binary_content response to prevent the
+browser from trying to interpret the returned data.
+
+Legacy Method (HTML interface)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+With the addition of file binary content streaming in the rest
+interface to Roundup 2.5.0, this method (using the html interface) is
+considered legacy but still works.
+
 To retreive the content, you can use the content link property:
 ``https://.../demo/msg11/``. The trailing / is required. Without the
 /, you get a web page that includes metadata about the message. With
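
For illustration, a client could build a request for the ``/binary_content`` endpoint described in the documentation change with Python's standard library. This is only a minimal sketch and not part of the commit: the tracker root ``https://tracker.example.org/demo`` and the helper name ``binary_content_request`` are hypothetical, and no authentication is shown.

```python
from urllib.request import Request


def binary_content_request(base_url, classname, item_id, accept="*/*"):
    """Build (but do not send) a GET request for an item's raw content.

    base_url is a hypothetical tracker root; classname is "file" or "msg".
    """
    url = "%s/rest/data/%s/%s/binary_content" % (
        base_url.rstrip("/"), classname, item_id)
    # The Accept header selects between the raw stream and the
    # structured (json) representation.
    return Request(url, headers={"Accept": accept}, method="GET")


# Prefer the raw bytes, fall back to the json wrapper.
req = binary_content_request(
    "https://tracker.example.org/demo", "file", "23",
    accept="application/octet-stream; q=1.0, application/json; q=0.5")
```

Passing ``req`` to ``urllib.request.urlopen`` would send it; the ``Content-Type`` of the response should then show which representation the server chose.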

roundup/rest.py

Lines changed: 83 additions & 12 deletions
@@ -433,7 +433,6 @@ class RestfulInstance(object):
     __default_patch_op = "replace" # default operator for PATCH method
     __accepted_content_type = {
         "application/json": "json",
-        "*/*": "json",
     }
     __default_accept_type = "json"
 
@@ -2232,6 +2231,8 @@ def determine_output_format(self, uri):
           3) if empty or missing Accept header
              return self.__default_accept_type
           4) match and return best Accept header/version
+             this includes matching mime types for file downloads
+             using the binary_content property
              if version error found in matching type return 406 error
           5) if no requested format is supported return 406
              error
@@ -2281,21 +2282,55 @@ def determine_output_format(self, uri):
             self.client.response_code = 406
             return (None, uri, self.error_obj(
                 400, _("Unable to parse Accept Header. %(error)s. "
-                       "Acceptable types: %(acceptable_types)s") % {
+                       "Acceptable types: */*, %(acceptable_types)s") % {
                     'error': e.args[0],
-                    'acceptable_types': " ".join(sorted(
+                    'acceptable_types': ", ".join(sorted(
                         self.__accepted_content_type.keys()))}))
 
         if not accept_header:
             # we are using the default
             return (self.__default_accept_type, uri, None)
 
         accept_type = ""
+        valid_binary_content_types = []
+        if uri.endswith("/binary_content"):
+            request_path = uri
+            request_class, request_id = request_path.split('/')[-3:-1]
+            try:
+                designator_type = self.db.getclass(
+                    request_class).get(request_id, "type")
+            except (KeyError, IndexError):
+                # class (KeyError) or
+                # id (IndexError) does not exist
+                # Return unknown mime type and no error.
+                # The 400/404 error will be thrown by other code.
+                return (None, uri, None)
+
+            if designator_type:
+                # put this first as we usually require exact mime
+                # type match and this will be matched most often.
+                # Also for text/* Accept header it will be returned.
+                valid_binary_content_types.append(designator_type)
+
+            if not designator_type or designator_type.startswith('text/'):
+                # allow text/* as msg items can have empty type field
+                # also match text/* for text/plain, text/x-rst,
+                # text/markdown, text/html etc.
+                valid_binary_content_types.append("text/*")
+
+            # Octet-stream should be allowed for any content.
+            # client.py sets 'X-Content-Type-Options: nosniff'
+            # for file downloads (sendfile) via the html interface,
+            # so we should be able to set it in this code as well.
+            valid_binary_content_types.append("application/octet-stream")
+
         for part in accept_header:
             if accept_type:
                 # we accepted the best match, stop searching for
                 # lower quality matches.
                 break
+
+            # check for structured rest return types (json xml)
             if part[0] in self.__accepted_content_type:
                 accept_type = self.__accepted_content_type[part[0]]
                 # Version order:
@@ -2311,6 +2346,8 @@ def determine_output_format(self, uri):
                 # use default if version = None
                 try:
                     self.api_version = int(part[1]['version'])
+                    if self.api_version not in self.__supported_api_versions:
+                        raise ValueError
                 except KeyError:
                     self.api_version = None
                 except (ValueError, TypeError):
@@ -2323,17 +2360,45 @@ def determine_output_format(self, uri):
                     return (None, uri,
                             self.error_obj(406, msg))
 
+            if part[0] == "*/*":
+                if valid_binary_content_types:
+                    self.client.setHeader("X-Content-Type-Options", "nosniff")
+                    accept_type = valid_binary_content_types[0]
+                else:
+                    accept_type = "json"
+
+            # check type of binary_content
+            if part[0] in valid_binary_content_types:
+                self.client.setHeader("X-Content-Type-Options", "nosniff")
+                accept_type = part[0]
+            # handle text wildcard
+            if ((part[0] in 'text/*') and
+                    "text/*" in valid_binary_content_types):
+                self.client.setHeader("X-Content-Type-Options", "nosniff")
+                # use best choice of mime type, try not to use
+                # text/* if there is a real text mime type/subtype.
+                accept_type = valid_binary_content_types[0]
+
         # accept_type will be empty only if there is an Accept header
         # with invalid values.
         if accept_type:
             return (accept_type, uri, None)
 
-        self.client.response_code = 400
+        if valid_binary_content_types:
+            return (None, uri,
+                    self.error_obj(
+                        406,
+                        _("Requested content type(s) '%s' not available.\n"
+                          "Acceptable mime types are: */*, %s") %
+                        (self.client.request.headers.get('Accept'),
+                         ", ".join(sorted(
+                             valid_binary_content_types)))))
+
         return (None, uri,
                 self.error_obj(
                     406,
                     _("Requested content type(s) '%s' not available.\n"
-                      "Acceptable mime types are: %s") %
+                      "Acceptable mime types are: */*, %s") %
                     (self.client.request.headers.get('Accept'),
                      ", ".join(sorted(
                          self.__accepted_content_type.keys())))))
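
The type selection added to determine_output_format can be hard to follow in diff form, so here is a standalone sketch of the same idea. It is a simplification under stated assumptions, not the code from the commit: the function name ``pick_binary_content_type`` is hypothetical, the Accept header is assumed to be already parsed and sorted best-first, and API version handling, xml output and the ``nosniff`` header are left out.

```python
def pick_binary_content_type(accept_types, stored_type):
    """Return the output type for a /binary_content request, or None.

    accept_types: mime types from the Accept header, best match first
        (parsing and q-value sorting are assumed to be done already).
    stored_type: the item's "type" property; may be None or "".
    """
    valid = []
    if stored_type:
        # The stored type goes first so wildcards resolve to it.
        valid.append(stored_type)
    if not stored_type or stored_type.startswith("text/"):
        # msg items often have no type; treat them as text.
        valid.append("text/*")
    # Raw bytes are acceptable for any content.
    valid.append("application/octet-stream")

    for mime in accept_types:
        if mime == "application/json":
            return "json"      # structured output was ranked higher
        if mime == "*/*":
            return valid[0]
        if mime == "text/*" and "text/*" in valid:
            # Prefer a real text subtype over the bare wildcard.
            return valid[0]
        if mime in valid:
            return mime
    return None                # caller would answer with a 406
```

For example, ``pick_binary_content_type(["text/*"], "text/markdown")`` resolves the wildcard to the stored ``text/markdown`` type, while an item with no stored type falls back to ``text/*`` itself.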
@@ -2597,14 +2662,20 @@ def format_dispatch_output(self, accept_mime_type, output,
 
             output = '<?xml version="1.0" encoding="UTF-8" ?>\n' + \
                 b2s(dicttoxml(output, root=False))
+        elif accept_mime_type:
+            self.client.setHeader("Content-Type", accept_mime_type)
+            # do not send etag when getting binary_content. The ETag
+            # is for the item not the content of the item. So the ETag
+            # can change even though the content is the same. Since
+            # content is immutable by default, the client shouldn't
+            # need the etag for writing.
+            self.client.setHeader("ETag", None)
+            return output['data']['data']
         else:
-            # FIXME?? consider moving this earlier. We should
-            # error out before doing any work if we can't
-            # display acceptable output.
-            self.client.response_code = 406
-            output = ("Requested content type '%s' is not available.\n"
-                      "Acceptable types: %s" % (accept_mime_type,
-                      ", ".join(sorted(self.__accepted_content_type.keys()))))
+            self.client.response_code = 500
+            output = _("Internal error while formatting response.\n"
+                       "accept_mime_type is not defined. This should\n"
+                       "never happen\n")
 
         # Make output json end in a newline to
         # separate from following text in logs etc..
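
The new branch in format_dispatch_output can be pictured with the following standalone sketch. It makes several assumptions that are not in the commit: the function name ``format_output`` is hypothetical, a plain dict stands in for the client's response headers, the status is returned instead of being set on a client object, and the xml branch is omitted.

```python
import json


def format_output(accept_mime_type, output, headers):
    """Return (status, body) for an already computed `output` dict.

    `headers` is a plain dict standing in for the client's response
    headers; that stand-in is an assumption of this sketch.
    """
    if accept_mime_type == "json":
        headers["Content-Type"] = "application/json"
        # End json output with a newline to separate it in logs.
        return 200, json.dumps(output) + "\n"
    if accept_mime_type:
        # Raw content: label it with the negotiated mime type and
        # drop the ETag, which describes the item, not its content.
        headers["Content-Type"] = accept_mime_type
        headers.pop("ETag", None)
        return 200, output["data"]["data"]
    # Nothing was negotiated: report an internal error.
    return 500, "Internal error while formatting response.\n"
```

Because unsupported Accept values are rejected with a 406 before any work is done, the final branch should only be reachable through a programming error, which is why it maps to a 500 here.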
