Commit f7feda6

Some porting advice from Joseph Myers.
1 parent fd4bbab commit f7feda6

1 file changed, +58 -0 lines changed

2to3-done.txt

Lines changed: 58 additions & 0 deletions
@@ -150,3 +150,61 @@ NOTHING DONE
 ./roundup/cgi/__init__.py
 ./roundup/cgi/apache.py
 ./roundup/cgi/client.py
+
+Joseph S. Myers notes:
+>The key difficulty is undoubtedly dealing with the changes to string types
+>- combined with how the extensibility of Roundup means people will have
+>Python code in their instances (detectors, etc.), both directly and
+>embedded in HTML - which passes strings to Roundup interfaces and gets
+>strings from Roundup interfaces.
+>
+>Roundup makes heavy use of string objects that really are text strings -
+>logically, sequences of Unicode code points. Right now, those strings,
+>with Python 2, are str objects, encoded in UTF-8. This means that
+>people's Python code in their instances, running under Python 2, will
+>expect str objects encoded in UTF-8 (and if their code is e.g. generating
+>HTML text encoded in UTF-8 to be sent to the user, it never actually has
+>to deal with the encoding explicitly, just passes the text through).
+>(The experimental Jinja2 templating engine then explicitly converts those
+>UTF-8 encoded str objects to unicode objects because that's what Jinja2
+>expects to deal with.)
+>
+>It's quite plausible people's code in their instances will work fine with
+>Python 3 if it gets str objects for both Python 2 and Python 3 (UTF-8
+>encoded str for Python 2, ordinary Unicode string objects for Python 3).
+>It's more likely to break if it gets Python 2 unicode objects, although
+>using such objects in Python 2 seems to be how a lot of people do their
+>porting to Python 3. And certainly if when an instance is running with
+>Python 3, it gets an object that's not a native sequence of Unicode code
+>points, but has each UTF-8 byte as a separate element of the str object,
+>things will break.
+>
+>(I have an instance that uses Unicode collation via PyICU on data from
+>Roundup, for example. That works fine with UTF-8 str objects in Python 2,
+>would work fine with Python 2 unicode objects though I don't use those,
+>works fine with Python 3 str objects when used in their native way - the
+>same code has a large part also used outside of Roundup that works with
+>both Python 2 and Python 3. Actually, I'd like to have a way to make
+>Roundup's built-in sorting of database objects use Unicode collation, or
+>otherwise have a way of computing a sort key that isn't simply naming a
+>particular property as the sort key, but that's another matter.)
+>
+>But Roundup *also* has strings that are sequences of bytes - String()
+>database fields, which can be both. Many are data displayed directly on
+>web pages and edited there by the user - those are ordinary strings (UTF-8
+>at present). But FileClass objects have a String() content property which
+>is arbitrary binary data such as an attached file - which logically should
+>appear to the user as a bytes object in Python 3. Except that some
+>FileClass objects use that data to store text (e.g. the msg class in the
+>classic scheme). So you definitely need a Bytes() alternative to String()
+>fields, for binary data, and may or may not also need separate text and
+>binary variants of FileClass.
+>
+>I've found that for text-heavy code, always using str objects for text and
+>having them be normal Unicode strings in Python 3 but UTF-8-encoded in
+>Python 2 works well with the vast bulk of code being encoding-agnostic and
+>just passing the strings around. Obviously things are different for the
+>sort of code that mixes text and binary data - that is, the sort of thing
+>you describe as systems programs in your porting HOWTO. I don't think
+>Roundup really is such a systems program, except in limited areas such as
+>dealing with attached files.
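
Illustration (not part of the commit): the "native str" convention described in the notes above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Roundup code. The helper names bytes_to_native, native_to_bytes and read_content are hypothetical and chosen only for this example; the is_text flag stands in for whatever would distinguish a text FileClass from a binary one.

```python
import sys

PY2 = sys.version_info[0] == 2


def bytes_to_native(data, encoding="utf-8"):
    """Convert bytes from storage or the network to the native str type.

    Hypothetical helper, named for this example only.
    """
    if PY2:
        return data  # Python 2: the native str already is the UTF-8 byte string
    return data.decode(encoding)  # Python 3: the native str holds code points


def native_to_bytes(text, encoding="utf-8"):
    """Convert a native str back to bytes for storage or output."""
    if PY2:
        return text
    return text.encode(encoding)


def read_content(raw, is_text):
    """Return native str for text content, unchanged bytes for binary content.

    Hypothetical stand-in for a text/binary FileClass split; not a Roundup API.
    """
    return bytes_to_native(raw) if is_text else raw


raw = b"na\xc3\xafve"                # UTF-8 bytes for a five-character word
text = bytes_to_native(raw)          # encoding-agnostic code just passes this on
assert native_to_bytes(text) == raw  # the round trip holds on Python 2 and 3

# The failure mode from the notes: a Python 3 str with one element per
# UTF-8 byte instead of one element per code point.
if not PY2:
    mangled = raw.decode("latin-1")
    assert len(text) == 5 and len(mangled) == 6
```

Under this scheme, instance code that only passes text through never needs to mention an encoding; conversion happens at the boundaries where bytes enter or leave, which is also where attached-file content would stay as bytes.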
