Commit 6412d1e

feat: blobstore-driven meeting materials (ietf-tools#9780)
* feat: meeting materials blob resolver API (ietf-tools#9700)
* refactor: exclude_deleted() for StoredObject queryset
* chore: comment
* feat: meeting materials blob resolver API
* feat: materials blob retrieval API (ietf-tools#9728)
* feat: materials blob retrieval API (WIP)
* refactor: alphabetize ARTIFACT_STORAGE_NAMES
* chore: limit buckets served
* refactor: any-meeting option in _get_materials_doc()
* feat: create missing blobs on retrieval
* feat: render HTML from markdown via API (ietf-tools#9729)
* chore: add comment
* fix: allow bluesheets to be retrieved
  Normally not retrieved through /meeting/materials, but they're close enough in purpose that we might as well make them available.
* fix: only update StoredObject.modified if changed
* fix: preserve mtime when creating blob
* refactor: better exception name
* feat: render .md.html from .md blob
* fix: explicit STATIC_IETF_ORG value in template
  Django's context_processors are not applied to render_string calls as we use them here, so settings are not available.
* fix: typo
* fix: decode utf-8 properly
* feat: use filesystem to render .md.html
* fix: copy/paste error in api_resolve_materials_name
* refactor: get actual rev in _get_materials_doc (ietf-tools#9741)
* fix: return filename, not full path
* feat: precompute blob lookups for meeting materials (ietf-tools#9746)
* feat: ResolvedMaterial model + migration
* feat: method to populate ResolvedMaterial (WIP)
* refactor: don't delete ResolvedMaterials
  Instead of deleting the ResolvedMaterials for a meeting, which might lose updates made during processing, update existing rows with any changes and warn if anything changed during the process.
* fix: fix _get_materials_doc()
  Did not handle the possibility of multiple DocHistory objects with the same rev.
* refactor: factor out material lookup helper
* feat: resolve blobs via blobdb/fs for cache
* chore: add resource
* feat: admin for ResolvedMaterial
* feat: cache-driven resolve materials API
* fix: add all ResolvedMaterials; var names
* fix: handle null case
* feat: resolve_meeting_materials_task
* feat: update resolver cache on material upload (ietf-tools#9759)
* feat: robustness + date range for resolve materials task (ietf-tools#9760)
* fix: limit types added to ResolvedMaterial
* feat: resolve meeting materials in order by date
* feat: add meetings_until param
* fix: log&continue if resolving fails on a meeting
* feat: log error message on parse errors
* refactor: move ResolvedMaterial to blobdb app (ietf-tools#9762)
* refactor: move ResolvedMaterial to blobdb app
* fix: undo accidental removal
* chore: fix lint (ietf-tools#9767)
* fix: don't use DocHistory to find materials (ietf-tools#9771)
* fix: don't use DocHistory to validate revs
  The DocHistory records are incomplete and, in particular, -00 revs are often missing.
* Revert "refactor: get actual rev in _get_materials_doc (ietf-tools#9741)"
  This reverts commit 7fd1580
* chore: remove the on-demand resolver api
* chore: fix lint
* feat: populate materials buckets (ietf-tools#9777)
* refactor: drop .txt from filename_with_rev()
* feat: utilities to populate materials blobs
* feat: store materials for a full meeting as blobs
  Plus a bunch of fixup from working with real data. (Based on meetings 71, 83, and 118, picked arbitrarily)
* chore: update migration
* feat: task to store materials in blobdb
* refactor: reimplement api_retrieve_materials_blob
* fix: update resolving task, fix bugs
* Revert "refactor: drop .txt from filename_with_rev()"
  This reverts commit a849d0f.
* chore: fix lint

---------

Co-authored-by: Robert Sparks <rjsparks@nostrum.com>
1 parent d27f08e commit 6412d1e

14 files changed

Lines changed: 798 additions & 44 deletions

ietf/api/urls.py

Lines changed: 3 additions & 0 deletions
@@ -49,6 +49,9 @@
     url(r'^group/role-holder-addresses/$', api_views.role_holder_addresses),
     # Let IESG members set positions programmatically
     url(r'^iesg/position', views_ballot.api_set_position),
+    # Find the blob to store for a given materials document path
+    url(r'^meeting/(?:(?P<num>(?:interim-)?[a-z0-9-]+)/)?materials/%(document)s(?P<ext>\.[A-Za-z0-9]+)?/resolve-cached/$' % settings.URL_REGEXPS, meeting_views.api_resolve_materials_name_cached),
+    url(r'^meeting/blob/(?P<bucket>[a-z0-9-]+)/(?P<name>[a-z][a-z0-9.-]+)$', meeting_views.api_retrieve_materials_blob),
     # Let Meetecho set session video URLs
     url(r'^meeting/session/video/url$', meeting_views.api_set_session_video_url),
     # Let Meetecho tell us the name of its recordings
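To see which paths the new blob-retrieval route accepts, the pattern can be exercised on its own with Python's `re` module. This is only an illustrative sketch: the regular expression is copied from the added `url()` entry, while `parse_blob_path`, the `slides` bucket name, and the sample file names are hypothetical and not taken from the commit.

```python
import re

# Pattern copied from the added url() entry for blob retrieval.
BLOB_URL = re.compile(
    r'^meeting/blob/(?P<bucket>[a-z0-9-]+)/(?P<name>[a-z][a-z0-9.-]+)$'
)


def parse_blob_path(path: str):
    """Return (bucket, name) if the path matches the route, else None.

    Hypothetical helper for illustration; the real view receives these
    values as keyword arguments from Django's URL resolver.
    """
    m = BLOB_URL.match(path)
    return (m["bucket"], m["name"]) if m else None


# Lowercase bucket and a name starting with a letter: accepted.
print(parse_blob_path("meeting/blob/slides/slides-118-foo-00.pdf"))
# Uppercase characters are outside the bucket character class: rejected.
print(parse_blob_path("meeting/blob/Slides/x.pdf"))
# The name must begin with a lowercase letter: rejected.
print(parse_blob_path("meeting/blob/slides/118-foo.pdf"))
```

Note that the name pattern also excludes underscores and uppercase letters, so blob names containing them would not be reachable through this route.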

ietf/blobdb/admin.py

Lines changed: 10 additions & 1 deletion
@@ -3,7 +3,7 @@
 from django.db.models.functions import Length
 from rangefilter.filters import DateRangeQuickSelectListFilterBuilder

-from .models import Blob
+from .models import Blob, ResolvedMaterial


 @admin.register(Blob)
@@ -29,3 +29,12 @@ def get_queryset(self, request):
     def object_size(self, instance):
         """Get the size of the object"""
         return instance.object_size  # annotation added in get_queryset()
+
+
+@admin.register(ResolvedMaterial)
+class ResolvedMaterialAdmin(admin.ModelAdmin):
+    model = ResolvedMaterial
+    list_display = ["name", "meeting_number", "bucket", "blob"]
+    list_filter = ["meeting_number", "bucket"]
+    search_fields = ["name", "blob"]
+    ordering = ["name"]
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+# Copyright The IETF Trust 2025, All Rights Reserved
+
+from django.db import migrations, models
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ("blobdb", "0001_initial"),
+    ]
+
+    operations = [
+        migrations.CreateModel(
+            name="ResolvedMaterial",
+            fields=[
+                (
+                    "id",
+                    models.BigAutoField(
+                        auto_created=True,
+                        primary_key=True,
+                        serialize=False,
+                        verbose_name="ID",
+                    ),
+                ),
+                ("name", models.CharField(help_text="Name to resolve", max_length=300)),
+                (
+                    "meeting_number",
+                    models.CharField(
+                        help_text="Meeting material is related to", max_length=64
+                    ),
+                ),
+                (
+                    "bucket",
+                    models.CharField(help_text="Resolved bucket name", max_length=255),
+                ),
+                (
+                    "blob",
+                    models.CharField(help_text="Resolved blob name", max_length=300),
+                ),
+            ],
+        ),
+        migrations.AddConstraint(
+            model_name="resolvedmaterial",
+            constraint=models.UniqueConstraint(
+                fields=("name", "meeting_number"), name="unique_name_per_meeting"
+            ),
+        ),
+    ]

ietf/blobdb/models.py

Lines changed: 20 additions & 0 deletions
@@ -96,3 +96,23 @@ def _emit_blob_change_event(self, using=None):
             ),
             using=using,
         )
+
+
+class ResolvedMaterial(models.Model):
+    # A Document name can be 255 characters; allow this name to be a bit longer
+    name = models.CharField(max_length=300, help_text="Name to resolve")
+    meeting_number = models.CharField(
+        max_length=64, help_text="Meeting material is related to"
+    )
+    bucket = models.CharField(max_length=255, help_text="Resolved bucket name")
+    blob = models.CharField(max_length=300, help_text="Resolved blob name")
+
+    class Meta:
+        constraints = [
+            models.UniqueConstraint(
+                fields=["name", "meeting_number"], name="unique_name_per_meeting"
+            )
+        ]
+
+    def __str__(self):
+        return f"{self.name}@{self.meeting_number} -> {self.bucket}:{self.blob}"
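The `unique_name_per_meeting` constraint means each `(name, meeting_number)` pair maps to at most one `(bucket, blob)` target, and the commit message describes updating existing rows rather than deleting them. A minimal in-memory sketch of that behaviour follows; `ResolverCache`, `Resolved`, and the sample values are hypothetical stand-ins, and the real code presumably goes through the Django ORM instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resolved:
    """Hypothetical value type holding the resolved target."""
    bucket: str
    blob: str


class ResolverCache:
    """In-memory stand-in for the ResolvedMaterial table (illustration only)."""

    def __init__(self) -> None:
        # The dict key plays the role of the (name, meeting_number) unique constraint.
        self._rows: dict[tuple[str, str], Resolved] = {}

    def upsert(self, name: str, meeting_number: str, bucket: str, blob: str) -> bool:
        """Insert or update a row in place; return True if anything changed."""
        key = (name, meeting_number)
        new = Resolved(bucket, blob)
        changed = self._rows.get(key) != new
        self._rows[key] = new
        return changed

    def resolve(self, name: str, meeting_number: str) -> Optional[Resolved]:
        """Look up the resolved target, or None if the name is unknown."""
        return self._rows.get((name, meeting_number))


cache = ResolverCache()
cache.upsert("slides-118-foo", "118", "slides", "slides-118-foo-00.pdf")
# A new revision replaces the existing row rather than adding a second one.
cache.upsert("slides-118-foo", "118", "slides", "slides-118-foo-01.pdf")
print(cache.resolve("slides-118-foo", "118"))
```

Returning whether a row changed mirrors the "warn if anything changed during the process" idea from the commit message, without claiming how the actual implementation detects it.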

ietf/doc/models.py

Lines changed: 9 additions & 0 deletions
@@ -913,6 +913,7 @@ def role_for_doc(self):
             roles.append('Action Holder')
         return ', '.join(roles)

+# N.B., at least a couple dozen documents exist that do not satisfy this validator
 validate_docname = RegexValidator(
     r'^[-a-z0-9]+$',
     "Provide a valid document name consisting of lowercase letters, numbers and hyphens.",
@@ -1588,9 +1589,17 @@ class BofreqResponsibleDocEvent(DocEvent):
     """ Capture the responsible leadership (IAB and IESG members) for a BOF Request """
     responsible = models.ManyToManyField('person.Person', blank=True)

+
+class StoredObjectQuerySet(models.QuerySet):
+    def exclude_deleted(self):
+        return self.filter(deleted__isnull=True)
+
+
 class StoredObject(models.Model):
     """Hold metadata about objects placed in object storage"""

+    objects = StoredObjectQuerySet.as_manager()
+
     store = models.CharField(max_length=256)
     name = models.CharField(max_length=1024, null=False, blank=False) # N.B. the 1024 limit on name comes from S3
     sha384 = models.CharField(max_length=96)

ietf/doc/storage.py

Lines changed: 7 additions & 3 deletions
@@ -32,7 +32,7 @@ def __init__(self, file, name, mtime=None, content_type="", store=None, doc_name
     @classmethod
     def from_storedobject(cls, file, name, store):
         """Alternate constructor for objects that already exist in the StoredObject table"""
-        stored_object = StoredObject.objects.filter(store=store, name=name, deleted__isnull=True).first()
+        stored_object = StoredObject.objects.exclude_deleted().filter(store=store, name=name).first()
         if stored_object is None:
             raise FileNotFoundError(f"StoredObject for {store}:{name} does not exist or was deleted")
         file = cls(file, name, store, doc_name=stored_object.doc_name, doc_rev=stored_object.doc_rev)
@@ -140,7 +140,11 @@ def _save_stored_object(self, name, content) -> StoredObject:
                 ),
             ),
         )
-        if not created:
+        if not created and (
+            record.sha384 != content.custom_metadata["sha384"]
+            or record.len != int(content.custom_metadata["len"])
+            or record.deleted is not None
+        ):
             record.sha384 = content.custom_metadata["sha384"]
             record.len = int(content.custom_metadata["len"])
             record.modified = now
@@ -160,7 +164,7 @@ def _delete_stored_object(self, name) -> Optional[StoredObject]:
         else:
             now = timezone.now()
             # Note that existing_record is a queryset that will have one matching object
-            existing_record.filter(deleted__isnull=True).update(deleted=now)
+            existing_record.exclude_deleted().update(deleted=now)
             return existing_record.first()

     def _save(self, name, content):
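The change to `_save_stored_object` only touches an existing record when its hash or length differs, or when it had been marked deleted, so re-saving identical content no longer bumps `modified`. The sketch below isolates that condition in plain Python. `Record` and `touch_if_changed` are hypothetical names, and the handling of `deleted` after the visible hunk is not shown in the diff, so it is deliberately left out here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Record:
    """Hypothetical stand-in for a StoredObject row."""
    sha384: str
    len: int
    modified: datetime
    deleted: Optional[datetime] = None


def touch_if_changed(record: Record, sha384: str, length: int, now: datetime) -> bool:
    """Update the record only if content changed or it was marked deleted.

    Returns True when the record was updated, False when it was left alone.
    """
    if (
        record.sha384 != sha384
        or record.len != length
        or record.deleted is not None
    ):
        record.sha384 = sha384
        record.len = length
        record.modified = now
        return True
    return False


created_at = datetime(2025, 1, 1, tzinfo=timezone.utc)
later = datetime(2025, 6, 1, tzinfo=timezone.utc)
rec = Record(sha384="abc", len=3, modified=created_at)

# Identical content: the modified timestamp is preserved.
print(touch_if_changed(rec, "abc", 3, later), rec.modified)
# Different hash: the record is updated and the timestamp moves forward.
print(touch_if_changed(rec, "def", 3, later), rec.modified)
```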

ietf/doc/storage_utils.py

Lines changed: 10 additions & 2 deletions
@@ -12,6 +12,14 @@
 from ietf.utils.log import log


+class StorageUtilsError(Exception):
+    pass
+
+
+class AlreadyExistsError(StorageUtilsError):
+    pass
+
+
 def _get_storage(kind: str) -> Storage:
     if kind in settings.ARTIFACT_STORAGE_NAMES:
         return storages[kind]
@@ -70,7 +78,7 @@ def store_file(
         # debug.show('f"Asked to store {name} in {kind}: is_new={is_new}, allow_overwrite={allow_overwrite}"')
         if not allow_overwrite and not is_new:
             debug.show('f"Failed to save {kind}:{name} - name already exists in store"')
-            raise RuntimeError(f"Failed to save {kind}:{name} - name already exists in store")
+            raise AlreadyExistsError(f"Failed to save {kind}:{name} - name already exists in store")
         new_name = _get_storage(kind).save(
             name,
             StoredObjectFile(
@@ -85,7 +93,7 @@ def store_file(
         if new_name != name:
             complaint = f"Error encountered saving '{name}' - results stored in '{new_name}' instead."
             debug.show("complaint")
-            raise RuntimeError(complaint)
+            raise StorageUtilsError(complaint)
     except Exception as err:
         log(f"Blobstore Error: Failed to store file {kind}:{name}: {repr(err)}")
         if settings.SERVER_MODE == "development":
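Replacing `RuntimeError` with a small exception hierarchy lets callers distinguish "the name is already taken" from other storage failures. The following sketch shows how such a hierarchy can be consumed. The two exception classes mirror the ones added above, while `store_once` and the in-memory `existing` set are hypothetical and exist only for this illustration; whether and when the real `store_file` re-raises is determined by code outside the visible hunk.

```python
class StorageUtilsError(Exception):
    """Base class, mirroring the one added in the diff."""


class AlreadyExistsError(StorageUtilsError):
    """Raised when a name is already present and overwriting is not allowed."""


def store_once(existing: set, name: str, allow_overwrite: bool = False) -> str:
    """Hypothetical helper: record a name, refusing duplicates by default."""
    if name in existing and not allow_overwrite:
        raise AlreadyExistsError(f"Failed to save {name} - name already exists in store")
    existing.add(name)
    return name


store = set()
store_once(store, "slides-118-foo-00.pdf")
try:
    store_once(store, "slides-118-foo-00.pdf")
except AlreadyExistsError:
    # A caller that expects duplicates can treat this case as benign.
    print("already stored, skipping")
except StorageUtilsError as err:
    # Any other storage problem is still caught by the base class.
    print(f"storage failure: {err}")
```

Because `AlreadyExistsError` subclasses `StorageUtilsError`, existing handlers for the base class keep working while new callers can be more selective.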

ietf/doc/views_material.py

Lines changed: 4 additions & 0 deletions
@@ -22,6 +22,7 @@
 from ietf.doc.utils import add_state_change_event, check_common_doc_name_rules
 from ietf.group.models import Group
 from ietf.group.utils import can_manage_materials
+from ietf.meeting.utils import resolve_uploaded_material
 from ietf.utils import log
 from ietf.utils.decorators import ignore_view_kwargs
 from ietf.utils.meetecho import MeetechoAPIError, SlidesManager
@@ -179,6 +180,9 @@ def edit_material(request, name=None, acronym=None, action=None, doc_type=None):
                             "There was an error creating a hardlink at %s pointing to %s: %s"
                             % (ftp_filepath, filepath, ex)
                         )
+                else:
+                    for meeting in set([s.meeting for s in doc.session_set.all()]):
+                        resolve_uploaded_material(meeting=meeting, doc=doc)

             if prev_rev != doc.rev:
                 e = NewRevisionDocEvent(type="new_revision", doc=doc, rev=doc.rev)

ietf/meeting/resources.py

Lines changed: 9 additions & 5 deletions
@@ -11,11 +11,15 @@

 from ietf import api

-from ietf.meeting.models import ( Meeting, ResourceAssociation, Constraint, Room, Schedule, Session,
-    TimeSlot, SchedTimeSessAssignment, SessionPresentation, FloorPlan,
-    UrlResource, ImportantDate, SlideSubmission, SchedulingEvent,
-    BusinessConstraint, ProceedingsMaterial, MeetingHost, Attended,
-    Registration, RegistrationTicket)
+from ietf.meeting.models import (Meeting, ResourceAssociation, Constraint, Room,
+                                 Schedule, Session,
+                                 TimeSlot, SchedTimeSessAssignment, SessionPresentation,
+                                 FloorPlan,
+                                 UrlResource, ImportantDate, SlideSubmission,
+                                 SchedulingEvent,
+                                 BusinessConstraint, ProceedingsMaterial, MeetingHost,
+                                 Attended,
+                                 Registration, RegistrationTicket)

 from ietf.name.resources import MeetingTypeNameResource
 class MeetingResource(ModelResource):

ietf/meeting/tasks.py

Lines changed: 129 additions & 2 deletions
@@ -1,13 +1,20 @@
-# Copyright The IETF Trust 2024, All Rights Reserved
+# Copyright The IETF Trust 2024-2025, All Rights Reserved
 #
 # Celery task definitions
 #
+import datetime
+
 from celery import shared_task
+# from django.db.models import QuerySet
 from django.utils import timezone

 from ietf.utils import log
 from .models import Meeting
-from .utils import generate_proceedings_content
+from .utils import (
+    generate_proceedings_content,
+    resolve_materials_for_one_meeting,
+    store_blobs_for_one_meeting,
+)
 from .views import generate_agenda_data
 from .utils import fetch_attendance_from_meetings

@@ -61,3 +68,123 @@ def fetch_meeting_attendance_task():
                 meeting_stats['processed']
             )
         )
+
+
+def _select_meetings(
+    meetings: list[str] | None = None,
+    meetings_since: str | None = None,
+    meetings_until: str | None = None
+): # nyah
+    """Select meetings by number or date range"""
+    # IETF-1 = 1986-01-16
+    EARLIEST_MEETING_DATE = datetime.datetime(1986, 1, 1)
+    meetings_since_dt: datetime.datetime | None = None
+    meetings_until_dt: datetime.datetime | None = None
+
+    if meetings_since == "zero":
+        meetings_since_dt = EARLIEST_MEETING_DATE
+    elif meetings_since is not None:
+        try:
+            meetings_since_dt = datetime.datetime.fromisoformat(meetings_since)
+        except ValueError:
+            log.log(
+                f"Failed to parse meetings_since='{meetings_since}' with fromisoformat"
+            )
+            raise
+
+    if meetings_until is not None:
+        try:
+            meetings_until_dt = datetime.datetime.fromisoformat(meetings_until)
+        except ValueError:
+            log.log(
+                f"Failed to parse meetings_until='{meetings_until}' with fromisoformat"
+            )
+            raise
+        if meetings_since_dt is None:
+            # if we only got meetings_until, start from the first meeting
+            meetings_since_dt = EARLIEST_MEETING_DATE
+
+    if meetings is None:
+        if meetings_since_dt is None:
+            log.log("No meetings requested, doing nothing.")
+            return Meeting.objects.none()
+        meetings_qs = Meeting.objects.filter(date__gte=meetings_since_dt)
+        if meetings_until_dt is not None:
+            meetings_qs = meetings_qs.filter(date__lte=meetings_until_dt)
+            log.log(
+                "Selecting meetings between "
+                f"{meetings_since_dt} and {meetings_until_dt}"
+            )
+        else:
+            log.log(f"Selecting meetings since {meetings_since_dt}")
+    else:
+        if meetings_since_dt is not None:
+            log.log(
+                "Ignoring meetings_since and meetings_until "
+                "because specific meetings were requested."
+            )
+        meetings_qs = Meeting.objects.filter(number__in=meetings)
+    return meetings_qs
+
+
+@shared_task
+def resolve_meeting_materials_task(
+    *, # only allow kw arguments
+    meetings: list[str] | None=None,
+    meetings_since: str | None=None,
+    meetings_until: str | None=None
+):
+    """Run materials resolver on meetings
+
+    Can request a set of meetings by number by passing a list in the meetings arg, or
+    by range by passing ISO-format timestamps in meetings_since / meetings_until.
+    To select all meetings, set meetings_since="zero" and omit other parameters.
+    """
+    meetings_qs = _select_meetings(meetings, meetings_since, meetings_until)
+    for meeting in meetings_qs.order_by("date"):
+        log.log(
+            f"Resolving materials for {meeting.type_id} "
+            f"meeting {meeting.number} ({meeting.date})..."
+        )
+        mark = timezone.now()
+        try:
+            resolve_materials_for_one_meeting(meeting)
+        except Exception as err:
+            log.log(
+                "Exception raised while resolving materials for "
+                f"meeting {meeting.number}: {err}"
+            )
+        else:
+            log.log(f"Resolved in {(timezone.now() - mark).total_seconds():0.3f} seconds.")
+
+
+@shared_task
+def store_meeting_materials_as_blobs_task(
+    *, # only allow kw arguments
+    meetings: list[str] | None = None,
+    meetings_since: str | None = None,
+    meetings_until: str | None = None
+):
+    """Push meeting materials into the blob store
+
+    Can request a set of meetings by number by passing a list in the meetings arg, or
+    by range by passing ISO-format timestamps in meetings_since / meetings_until.
+    To select all meetings, set meetings_since="zero" and omit other parameters.
+    """
+    meetings_qs = _select_meetings(meetings, meetings_since, meetings_until)
+    for meeting in meetings_qs.order_by("date"):
+        log.log(
+            f"Creating blobs for materials for {meeting.type_id} "
+            f"meeting {meeting.number} ({meeting.date})..."
+        )
+        mark = timezone.now()
+        try:
+            store_blobs_for_one_meeting(meeting)
+        except Exception as err:
+            log.log(
+                "Exception raised while creating blobs for "
+                f"meeting {meeting.number}: {err}"
+            )
+        else:
+            log.log(
+                f"Blobs created in {(timezone.now() - mark).total_seconds():0.3f} seconds.")
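Both tasks accept either explicit meeting numbers or a date range given as ISO-format strings, with `"zero"` as a sentinel meaning "from the earliest meeting". The date-handling part of `_select_meetings` can be tried without Django using the sketch below. `parse_range` is a hypothetical name and returns a plain tuple rather than a queryset; it covers only the string-to-datetime step, not the database filtering.

```python
import datetime

# Same lower bound as EARLIEST_MEETING_DATE in the diff (IETF-1 was 1986-01-16).
EARLIEST = datetime.datetime(1986, 1, 1)


def parse_range(meetings_since=None, meetings_until=None):
    """Return (since, until) datetimes following the rules in _select_meetings.

    Hypothetical helper for illustration. Raises ValueError if a string is
    not accepted by datetime.fromisoformat.
    """
    since = None
    until = None
    if meetings_since == "zero":
        since = EARLIEST
    elif meetings_since is not None:
        since = datetime.datetime.fromisoformat(meetings_since)
    if meetings_until is not None:
        until = datetime.datetime.fromisoformat(meetings_until)
        if since is None:
            # Only an upper bound was given, so start from the first meeting.
            since = EARLIEST
    return since, until


print(parse_range("zero"))                      # everything from the start
print(parse_range("2023-01-01", "2023-12-31"))  # one calendar year
print(parse_range(None, "2000-01-01"))          # upper bound only
print(parse_range())                            # nothing requested
```

When `parse_range()` returns `(None, None)` and no explicit meeting list is given, the task in the diff logs a message and processes nothing, which keeps an accidental argument-free invocation from touching every meeting.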
