Monday 16 December 2019

python - How to find a non-ascii byte in my code?



While building my App Engine app, I suddenly ran into an error that shows up every couple of requests:



    run_wsgi_app(application)
  File "/home/ubuntu/Programs/google/google_appengine/google/appengine/ext/webapp/util.py", line 98, in run_wsgi_app
    run_bare_wsgi_app(add_wsgi_middleware(application))
  File "/home/ubuntu/Programs/google/google_appengine/google/appengine/ext/webapp/util.py", line 118, in run_bare_wsgi_app
    for data in result:
  File "/home/ubuntu/Programs/google/google_appengine/google/appengine/ext/appstats/recording.py", line 897, in appstats_wsgi_wrapper
    result = app(environ, appstats_start_response)
  File "/home/ubuntu/Programs/google/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 717, in __call__
    handler.handle_exception(e, self.__debug)
  File "/home/ubuntu/Programs/google/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 463, in handle_exception
    self.error(500)
  File "/home/ubuntu/Programs/google/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 436, in error
    self.response.clear()
  File "/home/ubuntu/Programs/google/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 288, in clear
    self.out.seek(0)
  File "/usr/lib/python2.7/StringIO.py", line 106, in seek
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 208: ordinal not in range(128)


I really have no idea where this could be. It only happens when I use a specific function, but it's impossible to track every string I have.
It's possible this byte is a character like ' " [ ] etc., but in another language.
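
For illustration only, here is a minimal sketch of what the message in the traceback above means. It assumes the offending byte belongs to a UTF-8 encoded character, which the traceback alone does not confirm; the `sample` bytes and the `first_non_ascii` helper are hypothetical and not part of the app. Note that the last frame joins the response buffer (`''.join(self.buflist)`), so "position 208" is probably an offset into the output being written rather than into a source file.

```python
# Hypothetical sample: 0xd7 0x90 is the UTF-8 encoding of U+05D0.
sample = b'abc \xd7\x90 def'


def first_non_ascii(data):
    """Return (offset, byte value) of the first byte above 0x7f, or None."""
    for offset, value in enumerate(bytearray(data)):
        if value > 127:
            return offset, value
    return None


try:
    sample.decode('ascii')
except UnicodeDecodeError as err:
    # err.start is the offset of the offending byte, like "position 208".
    print('decode failed at offset %d' % err.start)

print(first_non_ascii(sample))
```

If that reading is right, the byte could come from any non-ASCII byte string written to the response alongside unicode strings, not only from a literal in the source code.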




How can I find this byte and possibly other ones?



I am running GAE with Python 2.7 on Ubuntu 11.04.



Thanks.



*updated*



This is the code I ended up using:

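from codecs import BOM_UTF8
from os import listdir, path

p = "path"  # placeholder: root directory to scan


def loopPath(p, times=0):
    for fname in listdir(p):
        filePath = path.join(p, fname)
        if path.isdir(filePath):
            # Recurse, then keep going; returning here would skip
            # the rest of the current directory.
            loopPath(filePath, times + 1)
            continue

        # splitext copes with names that have no dot or several dots.
        if path.splitext(fname)[1] != '.py':
            continue

        # Read raw bytes so the check behaves the same on Python 2 and 3.
        f = open(filePath, 'rb')
        ln = 0
        for line in f:
            # The UTF-8 BOM is 3 bytes long; skip it on the first line.
            if not ln and line[:3] == BOM_UTF8:
                line = line[3:]
            col = 0
            for c in bytearray(line):
                if c > 127:  # ASCII covers 0..127
                    f.close()
                    raise Exception('Found byte 0x%02x at line %d column %d in %s' % (c, ln + 1, col, filePath))
                col += 1
            ln += 1
        f.close()


loopPath(p)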

Answer



Just go through every character in each line of the code. Something like this:




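# -*- coding: utf-8 -*-
import io
import sys

# io.open decodes the file as UTF-8 on both Python 2.7 and Python 3.
with io.open(sys.argv[1], encoding='utf-8') as data:
    line = 0
    for l in data:
        line += 1
        char = 0
        for s in l:
            char += 1
            try:
                s.encode('ascii')
            except UnicodeEncodeError:
                # Only encoding failures mean "non-ASCII"; a bare except would hide other errors.
                print('Non ASCII character at line:%s char:%s' % (line, char))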
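
The per-character check above can be extended to a whole project tree. The following is a rough sketch rather than a definitive tool: `find_non_ascii` and its `suffix` parameter are hypothetical names, and it inspects raw bytes instead of decoded characters, so files that are not valid UTF-8 are still reported instead of raising.

```python
import os


def find_non_ascii(root, suffix='.py'):
    """Yield (path, line_no, col_no, byte_value) for each byte above 0x7f.

    Line and column numbers are 1-based; columns count bytes, not characters.
    """
    for dir_path, _dir_names, file_names in os.walk(root):
        for name in sorted(file_names):
            if not name.endswith(suffix):
                continue
            file_path = os.path.join(dir_path, name)
            # Binary mode: no decoding, so nothing can fail on bad input.
            with open(file_path, 'rb') as handle:
                for line_no, line in enumerate(handle, 1):
                    for col_no, value in enumerate(bytearray(line), 1):
                        if value > 127:
                            yield file_path, line_no, col_no, value
```

A possible way to use it is `for hit in find_non_ascii('.'): print('%s:%d:%d: byte 0x%02x' % hit)`, which lists every hit instead of stopping at the first one.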
