- support asynchronous operation -- add a per-fs 'reserved_space' count,
let each outstanding write reserve the _maximum_ amount of physical
space it could take. Let GC flush the outstanding writes because the
reservations will necessarily be pessimistic. With this we could even
do shared writable mmap, if we can have a fs hook for do_wp_page() to
make the reservation.
- disable compression in commit_write()?
- fine-tune the allocation / GC thresholds
- chattr support - turning on/off and tuning compression per-inode
- checkpointing (do we need this? scan is quite fast)
- make the scan code populate real inodes so read_inode just after
mount doesn't have to read the flash twice for large files.
Make this a per-inode option, changeable with chattr, so you can
decide which inodes should be in-core immediately after mount.
- test, test, test
- NAND flash support:
- almost done :)
- use bad block check instead of the hardwired byte check
- Optimisations:
- Split writes so they go to two separate blocks rather than just c->nextblock.
By writing _new_ nodes to one block, and garbage-collected REF_PRISTINE
nodes to a different one, we can separate clean nodes from those which
are likely to become dirty, and end up with blocks which are each far
closer to 100% or 0% clean, hence speeding up later GC progress dramatically.
- Stop keeping name in-core with struct jffs2_full_dirent. If we keep the hash in
the full dirent, we only need to go to the flash in lookup() when we think we've
got a match, and in readdir().
- Doubly-linked next_in_ino list to allow us to free obsoleted raw_node_refs immediately?
- Remove size from jffs2_raw_node_frag.
dedekind:
1. __jffs2_flush_wbuf() has a strange 'pad' parameter. Eliminate.
2. get_sb()->build_fs()->scan() path... Why get_sb() removes scan()'s crap in
case of failure? scan() does not clean everything. Fix.
/cgit.cgi/linux/net-next.git/commit/include/drm/drm_property.h?id=966d2b04e070bc040319aaebfec09e0144dc3341'>commitdiff
percpu-refcount: fix reference leak during percpu-atomic transition
percpu_ref_tryget() and percpu_ref_tryget_live() should return
"true" IFF they acquire a reference. But the return value from
atomic_long_inc_not_zero() is a long and may have high bits set,
e.g. PERCPU_COUNT_BIAS, and the return value of the tryget routines
is bool so the reference may actually be acquired but the routines
return "false" which results in a reference leak since the caller
assumes it does not need to do a corresponding percpu_ref_put().
This was seen when performing CPU hotplug during I/O, as hangs in
blk_mq_freeze_queue_wait where percpu_ref_kill (blk_mq_freeze_queue_start)
raced with percpu_ref_tryget (blk_mq_timeout_work).
Sample stack trace:
__switch_to+0x2c0/0x450
__schedule+0x2f8/0x970
schedule+0x48/0xc0
blk_mq_freeze_queue_wait+0x94/0x120
blk_mq_queue_reinit_work+0xb8/0x180
blk_mq_queue_reinit_prepare+0x84/0xa0
cpuhp_invoke_callback+0x17c/0x600
cpuhp_up_callbacks+0x58/0x150
_cpu_up+0xf0/0x1c0
do_cpu_up+0x120/0x150
cpu_subsys_online+0x64/0xe0
device_online+0xb4/0x120
online_store+0xb4/0xc0
dev_attr_store+0x68/0xa0
sysfs_kf_write+0x80/0xb0
kernfs_fop_write+0x17c/0x250
__vfs_write+0x6c/0x1e0
vfs_write+0xd0/0x270
SyS_write+0x6c/0x110
system_call+0x38/0xe0
Examination of the queue showed a single reference (no PERCPU_COUNT_BIAS,
and __PERCPU_REF_DEAD, __PERCPU_REF_ATOMIC set) and no requests.
However, conditions at the time of the race are count of PERCPU_COUNT_BIAS + 0
and __PERCPU_REF_DEAD and __PERCPU_REF_ATOMIC set.
The fix is to make the tryget routines use an actual boolean internally instead
of the atomic long result truncated to a int.
Fixes: e625305b3907 percpu-refcount: make percpu_ref based on longs instead of ints
Link: https://bugzilla.kernel.org/show_bug.cgi?id=190751
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Reviewed-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: e625305b3907 ("percpu-refcount: make percpu_ref based on longs instead of ints")
Cc: stable@vger.kernel.org # v3.18+