Welcome!

"Research" and "Projects" at the top show what I am doing and what I have done. In "What Am I Thinking" on the right, you can read my latest tweets. "Topics" on the right and "My Role" at the top categorize my posts by topic and by the role I played in each. Most posts on this website are in both English and Chinese.


Almost two years ago, I announced a small tool called Course Schedule Exporter, which exports course schedules in iCal format from BIT's internal banner system. It was based on a formerly private tool I had developed for this kind of export. To bring it online, I started a semi-private project called DPRBIT to bridge into the intranet of Beijing Institute of Technology (BIT), so that the tool could access the data residing in the banner system within the intranet.
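To illustrate what "exporting as iCal" means, here is a minimal sketch in Python. It is not the Exporter's actual code; the function name `to_ical`, the tuple layout, and the `example.invalid` UID domain are assumptions made up for this example, and text escaping of commas or semicolons in fields is left out.

```python
from datetime import datetime, timezone


def to_ical(events, stamp=None):
    """Render (summary, location, start, end) tuples as a minimal iCalendar string.

    Hypothetical helper for illustration only; start/end are naive datetimes
    and are written as floating local times.
    """
    stamp = stamp or datetime.now(timezone.utc)
    local = "%Y%m%dT%H%M%S"
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//course-schedule//EN",  # assumed product id
    ]
    for i, (summary, location, start, end) in enumerate(events):
        lines += [
            "BEGIN:VEVENT",
            f"UID:course-{i}@example.invalid",
            f"DTSTAMP:{stamp.strftime('%Y%m%dT%H%M%SZ')}",
            f"DTSTART:{start.strftime(local)}",
            f"DTEND:{end.strftime(local)}",
            f"SUMMARY:{summary}",
            f"LOCATION:{location}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are separated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

Calling `to_ical([("C Programming", "Room 205", start, end)])` returns text that calendar applications can import as an `.ics` file.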

DPRBIT provided an HTTP proxy as its interface to users. When a user tried to access any address via this proxy, our system redirected the user to our authentication portal to verify their identity via the Central Authentication Service (CAS) maintained by the Network Service Center of BIT. After authenticating, the user was granted use of the tunnel to access the intranet for 30 minutes. Course Schedule Exporter was built on this infrastructure, with some minor exemptions, mostly in the authentication part.
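The 30-minute grant can be sketched as a small table of expiry times. This is only an illustration of the idea, not DPRBIT's real code; the class name `TunnelGrants` and its methods are hypothetical, and the CAS verification step is assumed to have happened before `grant` is called.

```python
import time

GRANT_SECONDS = 30 * 60  # the 30-minute window described above


class TunnelGrants:
    """Tracks which users may currently use the tunnel (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the logic can be tested
        self._expiry = {}        # user id -> time at which the grant expires

    def grant(self, user):
        """Call after the user has been verified by the authentication portal."""
        self._expiry[user] = self._clock() + GRANT_SECONDS

    def is_allowed(self, user):
        """Return True while the grant is valid; False means 'redirect to the portal'."""
        expiry = self._expiry.get(user)
        if expiry is None or self._clock() >= expiry:
            self._expiry.pop(user, None)  # drop stale entries
            return False
        return True
```

A proxy front end would call `is_allowed` on every request and redirect to the authentication portal whenever it returns `False`.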

Both of them worked well. That year, DPRBIT helped a number of exchange and study-abroad students access the intranet to complete their compulsory capstone project filings. The university's VPN was authenticated based on the billing status of the student's campus Internet account, which is ridiculous: students studying abroad are not on campus to use the campus Internet, let alone pay its monthly fees.

These systems have been there for two years, although the public interface for DPRBIT was taken offline shortly after port-scanning activity from our server was reported to our ISP; we suspected a backdoor residing in the front tier of DPRBIT.

Recently, our link to BIT's intranet went down, presumably due to a network topology change at BIT. The CAS maintained by the Network Service Center is also no longer open to outside developers; it is restricted to their "official" services, with no other way to gain access. Since these two services have been running for two years and some of their designs have proved to lack scalability, I have decided to terminate them.

Their user interfaces are still running, but the core services have been taken down permanently. I would be happy to resume these services with any current BIT student who is interested in the project.


I've recently been working on installing a bioinformatics package called Whole-Genome Shotgun Assembler (wgs-assembler) on the cluster I work with. One error took us a while to fix, so I am writing up the details here in the hope of helping other system administrators install this software.

Like most research software, wgs contains a bunch of scripts that call different bioinformatics programs (its dependencies) to carry out the different stages of processing. Research software projects are generally poorly documented, so it is very hard to figure out what is going on when an error occurs. One of the tool scripts, PacBio Corrected Reads (PBcR), yielded the following error message when the user ran it:

ERROR: Overlap prep job /users/**/tempsciara/1-overlapper/correct_reads_part 1
ERROR: Overlap prep job /users/**/tempsciara/1-overlapper/correct_reads_part 2
8 overlap partitioning jobs failed.

This issue was hard to deal with because I could not reproduce it: the exact same command worked on my end. Apparently something differed between my run and the user's run. Reproduction is important in this case, as there is little documentation to consult for research software and the error message did not provide much useful information. The best way to deal with this kind of ad-hoc script is to debug the script that causes the trouble. By using a debugger, or simply by temporarily modifying the script to print more information, we can figure out the script's behavior and then the direct cause of the failure.

Finally, we managed to reproduce the error on our end, and it turned out that the script hard-codes an absolute path for an external command, `/usr/bin/time`. This is understandable, because `time` is a basic utility, and most Linux users have it installed at `/usr/bin/time`. However, a cluster normally needs to host as many compute nodes as possible. Putting too many things in the core image (i.e. storing too much locally on each node) would obstruct this goal and slow down the system. Therefore, in our case, we installed some non-essential utilities elsewhere, accessible via NFS. The user was running the job on a compute node that does not have `time` installed in `/usr/bin`, while we were attempting to reproduce it on a fully functional master node that has the command installed locally. The script did write an error message for the failure at this intermediate step, but the message was not documented anywhere, so we could only figure it out by going through the source code.

The fix for this issue is to replace every `/usr/bin/time` in PBcR (or PBcR.pl in the installation source code) with `/usr/bin/env time`. A bug report has been filed with the author.
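The replacement can be done by hand in an editor; as a sketch, the same substitution in Python could look like the following. The helper name `patch_time_path` is made up for this example, and the pattern assumes the hard-coded path is followed by whitespace, a quote, or the end of the line (so that something like `/usr/bin/timeout` is left alone).

```python
import re

# Match the hard-coded path, but not longer names such as /usr/bin/timeout.
HARD_CODED_TIME = re.compile(r"/usr/bin/time(?![\w.-])")


def patch_time_path(script_text):
    """Return the script text with /usr/bin/time replaced by /usr/bin/env time."""
    return HARD_CODED_TIME.sub("/usr/bin/env time", script_text)
```

Applying it to the contents of the script and writing the result back would perform the fix described above; running it a second time changes nothing, because the patched text no longer contains the original path.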

In summary, this issue shows that the efficient steps for dealing with errors in research software are:

  1. reproduce the errors;
  2. go through the source code and debug the code to figure out what those error messages mean and what causes the errors.

Another takeaway is that script developers should not make assumptions about the path of any command. `/usr/bin/env` is a great tool for locating a desired command through `PATH`.
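Scripts written in other languages can do the same kind of `PATH` lookup themselves. As a sketch, Python's standard `shutil.which` searches the directories in `PATH` for an executable; the wrapper name `resolve_command` is just an illustration.

```python
import os
import shutil


def resolve_command(name, path=None):
    """Find an executable by searching PATH (or the given search path).

    Raises FileNotFoundError with a clear message instead of failing later
    with an obscure error, which is what made the bug above hard to diagnose.
    """
    found = shutil.which(name, path=path)
    if found is None:
        searched = path if path is not None else os.environ.get("PATH", "")
        raise FileNotFoundError(f"{name!r} not found in: {searched}")
    return found
```

A script would call `resolve_command("time")` once at startup and use the returned path, so that a node with utilities installed on NFS works as long as that directory is on `PATH`.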


The Monkey recently reinstalled the operating system on his personal laptop to deal with some resource depletion. After the reinstallation, he was not able to use the SkyDrive feature built into Microsoft OneNote 2010.

OneNote first complained about WebDAV, and later, after a reboot, it showed the following message in the File->Open menu:

This service is disabled by policy. Contact your system administrator to enable it.

The Monkey started the WebDAV service and tried editing WebIntegratedEnabled and DisableSkydriveSetupOnFirstBoot in the registry as suggested by a post, but neither worked. He also installed the "spammy" Microsoft SkyDrive client, which did not resolve the OneNote issue at all.

Driven by the most naive idea, he decided to reinstall Microsoft Office, which did not help much. This time the "policy" error message in OneNote disappeared, but he still could not open OneNote, even through the "Launch in OneNote" links in the SkyDrive web UI.

The SkyDrive client complained about "Internet Connectivity", but the Monkey was sure that the Internet connection was up, the firewalls were open to the relevant software, and his roommate was not Skyping or playing video games. Finally, it turned out to be a configuration issue: SkyDrive integration requires TLS to establish connections, but for some reason some Windows installations have TLS support disabled by default, which causes the weird connectivity issue.

The solution is to select the options that allow TLS under the Advanced tab of Internet Options in the Control Panel. The problem was resolved immediately once the Monkey made this change. Since he could not find any direct information about this issue in relation to OneNote, we are posting this report in the hope that it helps others. The Monkey has declined to comment on this bad design, though.


Recently, I have been learning about Web security. Cross-Site Request Forgery (CSRF) forges a user's actions, relying on the cookies stored in the browser to act as that user or to steal information.

Same-origin policies (SOPs) are a mechanism enforced by browsers to protect their users. However, I found it hard to get a clear description in English of how SOPs behave, what they can offer, and what they cannot. After struggling with this mechanism for a couple of hours, I am putting my understanding here to request comments or corrections.

Scope: Script-Initiated Requests

SOPs are enforced only on requests initiated by client-side scripts. Therefore, cross-site requests initiated by resource references, such as loading an image or an iframe, are not subject to this rule. That is also why we can reference arbitrary images from arbitrary external websites on our own websites.

Browser: Enforcement after Requesting

For each script-initiated request, the browser sends the request to the server. However, before delivering the response back to the initiator, the browser checks whether the request violates the same-origin policies. If it does, the browser does not deliver the response to the script. Note that the request is actually made by the browser to the server.
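"Same origin" here means the same scheme, host, and port. A rough sketch of that comparison in Python follows; it only illustrates the rule and is not how any browser implements it, and the function names and the default-port table are assumptions for this example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Reduce a URL to its (scheme, host, port) origin triple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # An omitted port means the scheme's default port.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)
```

Under this rule, `https://example.com/a` and `https://example.com:443/b` share an origin, while `http://example.com/` and `https://example.com/` do not, because the scheme differs.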

Why can't browsers stop those requests before sending them out? Because we may also want to allow such script-initiated requests (for example, client-side API calls), which is called cross-origin resource sharing (CORS). When a server responds to the request, it includes headers such as Access-Control-Allow-Origin to tell browsers which origins the server allows. This information can only be retrieved by making the request. That is why browsers have to send the "suspicious" requests. (For requests that are not "simple", such as those carrying custom headers, the browser first asks for the same information with a preflight OPTIONS request.)
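The decision the browser makes after receiving the response can be sketched roughly as follows. This is a simplification of my understanding rather than the actual browser algorithm; the function name and the plain-dict representation of headers are assumptions for the example.

```python
def may_deliver_response(request_origin, response_headers, with_credentials=False):
    """Decide whether a cross-origin response may be handed to the script.

    request_origin is a serialized origin such as "https://app.example".
    """
    # HTTP header names are case-insensitive.
    headers = {k.lower(): v.strip() for k, v in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    if allowed is None:
        return False  # the server did not opt in to cross-origin reads
    if with_credentials:
        # With cookies attached, the wildcard is not accepted and the server
        # must also send Access-Control-Allow-Credentials: true.
        return (
            allowed == request_origin
            and headers.get("access-control-allow-credentials") == "true"
        )
    return allowed in ("*", request_origin)
```

Note that the function runs only after the response has arrived, which matches the point above: the request itself has already reached the server.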

SOPs: Avoid Information Leakage

So, what is protected then? When I first learned the formal name CSRF, I was very anxious about someone making HTTP requests on my behalf. However, we now see that the browser does not stop this.

Well, the browser is actually protecting us from information leakage. Whether the request is initiated by a resource reference or by a script whose response is withheld by the browser's SOPs, the malicious script does not get any information from the other website. So SOPs defend us against a malicious site using the cookies stored in the browser to gain unauthorized access to our data.

Server: Origin Headers and CSRF Tokens

So, what can protect us from unauthorized requests being made? As part of the CORS standard, all state-modifying requests (POST/DELETE methods...) should include an Origin header for the server. The server can then decide whether to execute the request by checking it against its list of accepted origins.
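A server-side check of this kind could be sketched as below. The allow-list contents, the function name `should_execute`, and the choice to reject state-modifying requests that carry no Origin header at all are assumptions made for this illustration, not part of any standard.

```python
ALLOWED_ORIGINS = {"https://app.example"}  # hypothetical accepted origins
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}  # treated here as not modifying state


def should_execute(method, headers, allowed_origins=ALLOWED_ORIGINS):
    """Decide whether the server should carry out the request."""
    if method.upper() in SAFE_METHODS:
        return True
    origin = headers.get("Origin")
    # Strict design choice: a missing Origin header is treated as untrusted.
    return origin is not None and origin in allowed_origins
```

With this check, `should_execute("POST", {"Origin": "https://evil.example"})` is rejected before any state is touched, while the same request from `https://app.example` goes through.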

However, on the real Internet, state modification also happens in GET requests. The standard does not require browsers to send the Origin header to the server for these, so the server needs another way to protect itself. There are multiple options, such as adding a CAPTCHA or checking the Referer header. Unfortunately, none of them solves the problem without changing user behavior.

SOPs + CSRF Tokens: Avoid Unauthorized Requests

CSRF tokens solve this issue by creating, in a preliminary page, a token that is not stored in the browser. A malicious script cannot get any information from the victim website due to SOPs, so it cannot obtain the token by requesting the preliminary page. Therefore, it cannot pass the CSRF token check on the server side.
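A minimal sketch of issuing and checking such a token, using only the Python standard library, might look like this. The `session` dictionary stands in for whatever server-side session store is used, and the function names are hypothetical.

```python
import hmac
import secrets


def issue_token(session):
    """Create a fresh token, remember it server-side, and return it.

    The returned value would be embedded in the preliminary page
    (for example, as a hidden form field).
    """
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid(session, submitted):
    """Check the token sent back with a state-modifying request."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A forged request from another site can carry the victim's cookies, but it cannot carry the token, because the attacker's script was never allowed to read the page that contained it.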

Now we see that the browser protects users' privacy, while the server protects users from unauthorized requests.

Remark: I am sure there are inaccuracies or mistakes, as CSRF is not my area of expertise. Please correct me if anything is wrong.


Goodbye, 2014! I still remember celebrating the Chinese New Year in Boston during my exchange program. All my memories still seem fresh, but everything has become a part of history.

The year 2014 was quite special to me. Earlier in 2014, I was a state-sponsored exchange student from Beijing Institute of Technology at Northeastern University, studying Electrical Engineering. Now, I am a graduate student at Brown University studying my favorite major, Computer Science. My life has changed dramatically, believe it or not.

The year 2014 also witnessed the launch of Chang'e III and the disappearance of Malaysia Airlines Flight 370. Those were the first times I felt so connected to my homeland. A friend commented that I had never seemed so patriotic before leaving my home country. It is true. Our homeland is our comfort zone, where we have friends, family, and familiar metropolitan systems.

I was experiencing the pain of change, since I was trying to jump out of my comfort zone. I changed my major as well as my school, decided to terminate my BEng/PhD track, and was forced by some arrogant idiots to give up my fast-track graduate school offers in my home country's higher education system.

I chose to challenge myself with a more competitive track in life. No matter what, I will continue on it. I believe it is always good to have more opportunities. (at Beijing Capital International Airport)







As promised, I am posting my graduate study resolutions. It is not actually possible to plan everything beforehand, so these will serve as guidance for my studies over the coming two years.

Time & Efficiency

  • Work with purpose all the time. Always know what I want when doing something.
  • Review after class, and check the references. Make sure to understand everything covered.
  • Keep a regular schedule, and avoid staying up late.

Behavior & Social

  • Stop mentioning or thinking about the past. Think about and look to the future.
  • Be helpful to friends. Make a couple of close friends here.
  • Never get angry about other people's behavior that has nothing to do with me.


  • Be happy and cheerful every day.
  • Don't be shy. Be proactive.
  • Take part in social activities.


  • Keep a budget, and report my expenses monthly.
  • Find a part-time on-campus job.
  • Cook by myself at least twice a week.
  • Work out, jog, or swim regularly.
  • Always read and study interesting topics. Read at the library at least twice a week.


  • Fill the gaps in my CS knowledge: graphics, compilers, AI, ...
  • Participate in research groups in NLP/AI, Data, and Networks.
  • Do a technical internship and a research internship.
  • Set aside time for reading about the latest innovations.

I have been lazy about updating my blog, and I apologize for it. If you look at what I've posted, you will see how infrequently I post.

It is not on purpose. Some of my friends may have already noticed that I was upset when I came back to Beijing. Over the past year, I have had troubles related to life studying abroad, the bureaucracy at BIT, friendships, my relationship, graduate school admission, and my family.

I think I am probably the kind of person who sets very strict requirements and expectations for myself. I really want to work hard, but I also expect a great outcome afterward. However, hard work does not always produce good (or expected) results. I should have understood this.

I had been very alert to the illusions that come with studying in a self-enclosed college, but keeping a clear sense of myself for four years is relatively hard. For now, I don't want to express any regret about anything that has been done or that I never got the chance to do. As I posted on social networks a couple of days ago, just let the past go, no matter how unclear I feel about my future, since it is time to cheer up and start the new journey.

Yes, I am in a rather strange state now. Academically, I used to be one of the best Electrical Engineering students in the Bachelor/Doctoral Program of Beijing Institute of Technology, but now I am a Computer Science master's candidate who knows just a little more than sophomores in Computer Science. I probably have to treat my two years here as another compressed "undergraduate" degree, since I must fill the gaps in my CS knowledge. It is a really hard turn for me, and I am still trying to adjust to it, even though Computer Science has been the major I was interested in for more than five years and I really enjoyed having fun with it. Studying it as a major is a very different thing. Having realized my dream of formally studying Computer Science, I must now take a risk.

Personally, I am not yet ready for this new period of my life. I still think about my close friends every day, and I have a lot of plans and goals, even a relationship, left unresolved after the sudden end of my undergraduate years. I don't want to talk about BIT anymore.

As for the long term, I had not had a chance to think thoroughly about my future and career, due to interference from my family and a heavy study workload. Now I have some time, and in my next post I am going to make resolutions for my new semester as well as for my graduate study at Brown.


















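In the afternoon we had Professor Gao Fei's C programming class. To our surprise, Professor Gao turned out to be a woman... There was also an international student in the class. I was rather shy at the time, but some of my classmates seemed to get along well with him. A few students from the previous year also sat in on the class or had somehow managed to enroll in it. The lab work for this course had to be completed on a system similar to an OJ (Online Judge). As a veteran NOIP contestant and a veteran of the NOI selection contests, this was of course no problem for me, so I plowed through all the lab assignments very quickly... Then a student named Ge from the Aircraft Design class took the initiative to get to know me and gave me the nickname "guru" (大神). It later turned out that Ge was the guru of his own class.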









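Our school also offered an introductory course on the majors, which the combined bachelor's-master's-doctoral program classes of two consecutive years took together, so I had the chance to get to know the students from the year above and the school's faculty. After the first class, I stayed in the classroom waiting for the canteen to become less crowded before going to lunch. Ruiheng and a senior student, Kaicheng, were also in the classroom, so we started chatting with him, and that is how I got to know him. Later, when we took part in mathematical modeling together during the winter break, I also got to know Jingyang and other senior students. The content of the introductory course was fairly interesting; at least I got a rough idea of what the school was working on. The breadth of knowledge I had accumulated from attending lectures everywhere came in handy and helped me absorb these new things quickly. However, it seemed that I was the only one paying attention in this class... At the time I was still hesitating over whether to transfer to the University of Waterloo to study computer science, so this course was very useful to me.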






























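  • (Discussing a dinner get-together with a classmate) Then I'll plan to go after the semester starts. I still seem to have a bit of alcohol poisoning and my mind is still a little foggy... Let's chat when I feel better... My whole body is cramping right now...
  • (Activating the bank card issued by the university) Welcome to the Professional Edition of ×× Bank personal online banking. Please log in to download, install, and activate the Professional Edition. The activation authorization code is on your application receipt; please keep it safe and do not disclose it to anyone. [×× Bank]
  • I'm going to BIT... where else do you think I could go...? Probably Information Engineering... I haven't decided yet... I still need to think it over...
  • No... I want something interesting to do too... They're chatting and I'm left out~ Uh, I haven't found anyone from Beijing yet.
  • (The class advisor announcing the first meeting) 9:30, gather at the entrance of the campus hospital after the physical exam. Please pass this on to the other students!
  • (Telling my high school homeroom teacher about my current situation) Good kid~ Study hard and keep a good attitude. How things turn out in the future has no absolute relation to which school you attend; it mainly depends on your own abilities... Keep it up!
  • No wonder high school didn't let us use electricity~ I just went to take a shower and nobody was there, because they are all getting vaccinated and those of us from Beijing don't need to~
  • Didn't I tell you... we have to rehearse for the evening gala... so we don't have to go... and we can still get a score of 90-something~ Oh hahaha~
  • Pay the fee before the physical exam, 105 yuan; be quick, everyone. Gather at 12:45 at the square next to the library to collect military uniforms; remind one another.
  • (Hongwei telling me about recruitment for the military training publicity team) This is Hongwei. Go to Room 205 of the Comprehensive Teaching Building (the Student TV Center) at 13:00. Let's keep in touch!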
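  • (Getting to know Brother Hao, the cheerful counselor) OK, let's stay in touch from now on :)
  • (Striking up a conversation with the military training instructor) Hello, your hair should be no longer than the width of your finger. Missing uniforms can be collected at 6:30 this evening.
  • (Reporting the noise from nearby construction to the counselor) OK, please tell everyone to put up with it for now. Teacher Huo
  • (Selected for the military training publicity team) This is the editorial department of the TV Center. Congratulations on being selected for the Beijing Institute of Technology Student TV Center and on joining this semester's military training film crew as one of its elite members. We will hold group training starting today (the 29th) at 良A205 (today's interview venue) at 8 pm. Please bring a laptop if possible. Please reply upon receipt.
  • (A rant; he later went abroad, which I only found out three years later when I was abroad myself) Chuanqi: The universities in Chengdu are terrible!
  • (Lao Bai left for the US; three years later, by a twist of fate, I ended up going to his school on exchange) Telegram from the Chairman: departing for the US at noon tomorrow. This number will be kept; feel free to contact me via Xiaonei or email. Take care, everyone! I hereby appoint you Director of Telecommunications at the Party-State's Beiping Station and concurrently Head of Liangxiang Township. Do a good job!
  • (A classmate at Peking University ranting about the course schedule) Mathematical analysis, advanced algebra, mechanics, electromagnetism, fundamentals of computing, programming...
  • Chen: Notice: starting today you are officially requisitioned by the Beijing Institute of Technology headquarters. Theory classes and exams must be attended as usual; the rest of your time is allocated by the headquarters. You must assemble at the headquarters by 11:50 today (enter through the back door of the library).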
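  • Early days of military training
    • Uh... one guy in my dorm has a crush on a girl... and a bunch of others are chatting about video games. What should I do? They're talking about getting up at four...
    • That's a classic case of not having been drilled enough... In a couple of days they won't have that much energy~ Tell them to take it easy.
  • Oh... what's your major again? I'm working on the TOEFL... Want to memorize vocabulary together?
  • Class advisor: Hello, everyone! I will come to your dorms around noon to collect some documents, so please be prepared. The students concerned should prepare the following three documents: the revised household registration transfer card you have received, the new student information form that was not handed in last time, and the family financial situation survey form. Thank you!
  • Class advisor: Tomorrow is your first Mid-Autumn Festival since leaving home to study in Beijing. On a day of reunion like this, it is hard not to feel like a stranger far from home, isn't it? Why not think of your teachers and the classmates around you as family! I wish you a university life as full and round as the moon on the fifteenth, a life as lasting as the moonlight, and a heart as deep and serene as the autumn night sky! Don't forget to call your family, and stay safe when you go out on your day off!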